Secondary Qualitative Data Sources and How to Find Them

Definition

Secondary data analysis is defined as “analysis of data that was collected by someone else for another primary purpose” [1]. Secondary qualitative data analysis is used for, among other purposes, the continued in-depth analysis of existing data sets, the study of additional subsets of the original data, and the description of historical and contextual characteristics of populations and societies [2].
For the purpose of this entry, qualitative secondary data sources will be divided into two genres: traditional sources and non-traditional sources. Traditional sources include qualitative data sets found in data repositories, such as sets of interviews and transcripts. Non-traditional sources include other sources of data not originally intended for research purposes.

Relevant Characteristics & Goals of the Method

Traditional Sources

Locating secondary data has become easier over the years as organizations build larger, more organized online data repositories. In the past, researchers would need to contact the original data collection team, gain approval for use of the documents, and then have the documents shipped to their location. Today researchers can log on to a variety of data repositories or data centers, type in the key terms they want to research, and narrow down the search results to a dataset that may answer the question they are asking.

These repositories store information related to the primary data collection including the data themselves, instruments used to collect the data, and contextual information including year and location the data were collected. Traditional data include those collected in surveys, interviews, video recorded sessions, and experiments. To find a data repository, simply search for “data repository” or “data archive” (searching for “qualitative data” or “data storage” will provide unwanted search results). We have provided a short list of prominent repository sites below under "Online Resources and Further Reading."

Many of these repositories have predetermined criteria regarding personal identifiers, open access, copyright, and long-term availability, but other repositories may have different criteria. As a result, some data sets include all the information needed to conduct a full secondary data analysis while others do not. It is important to be aware of what data are included and what data can be augmented through analysis of non-traditional secondary data.

Non-Traditional Sources

When qualitative data repositories and archives do not seem to have material related to your research question(s), it might seem like a stop sign on the road. However, there are a myriad of sources that social scientists usually overlook because the data they contain were not originally collected for research purposes. These data simply lie there, waiting to be systematically analyzed. Think, for example, of the letters written by soldiers during the Civil War. Although these archived documents were not specifically collected to answer your research question, a secondary qualitative analysis of those letters might give some valuable context to your research.

We arbitrarily decided to name these sources non-traditional, since they are not data archives in the traditional sense. These sources can be further subdivided into artifacts and social media content. Artifacts are evidence of human experience and include:

NT short table.PNG

Social media, on the other hand, include interactive, user-driven sites such as Twitter, Facebook, YouTube, Instagram, Pinterest, and most dating apps. Regardless of which type of non-traditional data source you choose, you can pursue the four general aims of qualitative research with these data (i.e., explore, describe, compare, or test models) by defining a unit of analysis. In a classic sense, units of analysis can be people, groups, objects, or even time [3]! For these non-traditional sources, units of analysis can be particular elements within the artifacts or social media, such as the references to weather in the letters written by soldiers during the Civil War or all the retweets of the President’s posts during the State of the Union address.

A full description of units of analysis for the social media pieces is given in the table under "Examples of Secondary Data Use from Non-Traditional Data Sources".

"Method Made Easy"

Traditional Sources

To use traditional data sources for secondary data analysis, start by locating a data repository. Define the search parameters, including subject/research domain, file format (if available), and data type (qualitative, quantitative), to locate an appropriate database. Once you have selected the database you want to search, open the link and define the search parameters for your specific area of interest. If your initial keyword search returns results, begin narrowing your search until you locate a dataset appropriate for analysis. If the dataset includes all the information you need, begin the process of gaining approved access to the dataset. Some institutions require Institutional Review Board (IRB) approval prior to data download, so check with your local IRB before obtaining data. Then obtain the data and begin analysis.
Traditional Flowchart.JPG
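
The narrowing step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the DatasetRecord fields, the narrow_search helper, and the catalog entries are all hypothetical, and a real repository will expose its own search interface and metadata fields.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DatasetRecord:
    """Hypothetical catalog entry; real repositories expose different fields."""
    title: str
    subject: str
    data_type: str          # "qualitative" or "quantitative"
    file_format: str
    requires_approval: bool


def narrow_search(records, keyword, subject=None, data_type=None, file_format=None):
    """Keep records whose title contains the keyword, then apply optional filters."""
    keyword = keyword.lower()
    matches = [r for r in records if keyword in r.title.lower()]
    if subject is not None:
        matches = [r for r in matches if r.subject == subject]
    if data_type is not None:
        matches = [r for r in matches if r.data_type == data_type]
    if file_format is not None:
        matches = [r for r in matches if r.file_format == file_format]
    return matches


# Made-up records, for illustration only.
CATALOG = [
    DatasetRecord("Caregiver interviews on rural health", "health", "qualitative", "txt", True),
    DatasetRecord("Rural health survey", "health", "quantitative", "csv", False),
    DatasetRecord("Teacher interviews on classroom technology", "education", "qualitative", "pdf", False),
]

# Start with a broad keyword, then narrow by subject and data type.
results = narrow_search(CATALOG, "interviews", subject="health", data_type="qualitative")
for record in results:
    note = "check access and IRB requirements" if record.requires_approval else "open access"
    print(f"{record.title} ({record.file_format}): {note}")
```

Starting broad and adding one filter at a time mirrors the workflow in the text: each filter should shrink the result list until a single candidate dataset remains to be checked for completeness and access conditions.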

Non-Traditional Sources

To use non-traditional data sources for secondary data analysis, first determine the type of data you need. This search assumes that the research topic has already been selected and that ethical issues have been considered prior to searching for data. If you are interested in social media, begin by selecting which social media outlet you want to analyze. If you are interested in other types of secondary data, like those listed in "Non-Traditional Sources" above, determine which types of materials you want to analyze. Once you have determined what material to analyze, begin defining the unit of analysis. Some institutions require Institutional Review Board (IRB) approval prior to data download, so check with your local IRB before obtaining data. Then obtain the data and begin analysis.
Nontraditional Flowchart.JPG

Advantages of Secondary Qualitative Data Analysis

  • Saves time: Rich data can be obtained with little work from the secondary researcher [4]
  • Beneficial for grant preparation for future studies [5]
  • Existing data sets are useful for comparing data sets, for generalizing your own research, and for creating longitudinal studies that would otherwise be costly [6] [7]
  • Gives the researcher access to sensitive populations, or populations that would otherwise be difficult to reach. Secondary analysis also decreases the strain on vulnerable populations [8] [9]
  • No interaction with population of study: a common argument in qualitative analysis is that qualitative research is more convincing when there is less interaction between the researcher and population of study [10]

Limitations to Secondary Qualitative Data Analysis

  • Limited population control: as a secondary data researcher, you are limited to the populations researched in previous studies [11]
  • The purist view of qualitative analysis posits that qualitative data are only valid in the context in which they were collected. In this argument, secondary analysis of qualitative data cannot be valid [12]
  • Limited information about the data set: the context in which the data were collected may not be clear, and the original researcher may not provide information regarding the conditions in which the data were collected [13]
  • Depending on how the data set is presented/shared, it may not be suitable for re-analysis. Notes in the margins and coding/memos written into the dataset may obscure the original data [14]

Analysis

[Note: There are additional ‘ways’ to look at the data for non-traditional sources.] Qualitative content analysis is more than simply counting words or pulling out objective content from texts; it looks at meanings, themes, and patterns that may be apparent or hidden in a specific text (Mayring, 2000). In any research project, it is always important to determine the unit of analysis [15]. The unit of analysis refers to the basic unit of text to be classified during qualitative content analysis [16]. Depending on the type of research conducted, the unit of analysis can be people, folk tales, cities, countries, newspapers, groups, objects, or even time [17]. If the researcher carefully prepares, codes, and interprets the data, qualitative content analysis can support the development of new theories and models, validate existing theories, and provide rich descriptions of particular settings.
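
As a rough illustration of classifying units of text, the sketch below applies a small codebook to a handful of text units. Everything in it is an assumption made for the example: the codebook, its indicator terms, and the sentences are invented, and the choice of one sentence per unit is arbitrary. Term matching of this kind is at most a first pass that flags units for closer reading; it does not replace the interpretive coding described above.

```python
import re
from collections import Counter

# Hypothetical codebook: each code (theme) maps to indicator terms chosen by the researcher.
CODEBOOK = {
    "weather": {"rain", "snow", "cold", "mud"},
    "family": {"mother", "father", "wife", "home"},
}

# Invented text units (one sentence per unit), not real archival material.
UNITS = [
    "The rain has not stopped and the mud is knee deep.",
    "Tell mother I think of home every night.",
    "It was so cold that I wrote to my wife by the fire.",
]


def code_unit(text, codebook):
    """Return the sorted list of codes whose indicator terms appear in the text unit."""
    words = set(re.findall(r"[a-z']+", text.lower()))
    return sorted(code for code, terms in codebook.items() if words & terms)


# Assign codes to every unit, then tally how many units received each code.
coded = [(unit, code_unit(unit, CODEBOOK)) for unit in UNITS]
code_counts = Counter(code for _, codes in coded for code in codes)

for unit, codes in coded:
    print(codes, "<-", unit)
print(dict(code_counts))
```

Changing the unit of analysis (for example, whole letters instead of sentences) only changes what goes into UNITS; the coding step stays the same.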

Method in Context

When considering secondary data sources, the most important question to ask yourself is ‘What is my unit of analysis?’ With traditional secondary sources, your unit of analysis is usually determined by the study design. If the design called for semi-structured interviews to be completed with participants, then your unit of analysis is the interview. With non-traditional secondary sources, your unit of analysis is much more flexible. Depending on your research question or your purpose for using secondary data, your unit of analysis could be almost anything.

One of the biggest advantages of non-traditional secondary sources is the ability to choose whatever unit of analysis is appropriate for your study. Creativity can be a huge asset in determining the source and unit of analysis you are going to use. For example, if your study examines trends on Twitter regarding a certain topic, there are many different units of analysis you might use. The tweets themselves would be one unit of analysis, as would the person or organization tweeting and the followers of that person or organization. Additional subunits of analysis include the contents of the tweets, hashtags, length of tweet, retweets, quotes, and many more.
Since units of analysis are so flexible, especially for non-traditional sources, their utility can be flexible as well. Non-traditional sources can be the basis for your analysis and can provide rich and interesting data. Notably, one of the benefits of non-traditional sources is that they can also provide supporting, supplementary evidence for your research questions or study.
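
To make the Twitter example concrete, the sketch below derives several of the units and subunits mentioned above (author, hashtags, length, retweets) from a few posts. The posts and the field names (author, text, is_retweet) are invented for illustration and do not follow any platform's actual data format; data obtained from a real platform would need to be mapped onto whatever fields you decide to analyze.

```python
import re
from collections import Counter

# Invented posts for illustration; field names are assumptions, not a platform's schema.
POSTS = [
    {"author": "org_a", "text": "New report on housing out today #housing #policy", "is_retweet": False},
    {"author": "user_b", "text": "RT @org_a: New report on housing out today #housing #policy", "is_retweet": True},
    {"author": "user_c", "text": "Rents keep rising in our city #Housing", "is_retweet": False},
]

HASHTAG = re.compile(r"#(\w+)")


def describe_post(post):
    """Break one post into candidate subunits of analysis."""
    return {
        "author": post["author"],
        "hashtags": [tag.lower() for tag in HASHTAG.findall(post["text"])],
        "length": len(post["text"]),
        "is_retweet": post["is_retweet"],
    }


described = [describe_post(post) for post in POSTS]

# The same posts summarized with different units of analysis.
hashtag_counts = Counter(tag for d in described for tag in d["hashtags"])
posts_per_author = Counter(d["author"] for d in described)
retweet_count = sum(d["is_retweet"] for d in described)

print(hashtag_counts.most_common())
print(dict(posts_per_author))
print("retweets:", retweet_count)
```

The same three posts yield a different data set depending on whether the hashtag, the author, or the retweet is treated as the unit of analysis, which is the flexibility the paragraph above describes.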

Online Resources and Further Reading

[Note: The examples provided are not exhaustive and are meant to provide a starting point for locating data. There are numerous repositories, archives, and types of non-traditional sources.]
Repositories:

Data Archives:

Examples of Secondary Data Use from Traditional Data Sources:
Traditional Data Table.PNG

Examples of Secondary Data Use from Non-Traditional Data Sources:
Social Media Table.PNG

References


  1. Smith, A.K., Ayanian, J.Z., Covinsky, K.E., Landon, B.E., McCarthy, E.P., Wee, C.C., Steinman, M.A. (2011). Conducting high-value secondary dataset analysis: an introductory guide and resources. Journal of General Internal Medicine, 26(8), 920-929.
  2. Fielding, N. (2004). Getting the most from archived qualitative data: epistemological, practical and professional obstacles. International Journal of Social Research Methodology, 7(1), 97-104.
  3. Bernard, H. R., & Ryan, G. W. (Eds.). (2010). Analyzing qualitative data: Systematic approaches. Sage.
  4. Smith, A.K., Ayanian, J.Z., Covinsky, K.E., Landon, B.E., McCarthy, E.P., Wee, C.C., Steinman, M.A. (2011). Conducting high-value secondary dataset analysis: an introductory guide and resources. Journal of General Internal Medicine, 26(8), 920-929.
  5. Smith, A.K., Ayanian, J.Z., Covinsky, K.E., Landon, B.E., McCarthy, E.P., Wee, C.C., Steinman, M.A. (2011). Conducting high-value secondary dataset analysis: an introductory guide and resources. Journal of General Internal Medicine, 26(8), 920-929.
  6. Fielding, N. (2004). Getting the most from archived qualitative data: epistemological, practical and professional obstacles. International Journal of Social Research Methodology, 7(1), 97-104.
  7. Smith, A.K., Ayanian, J.Z., Covinsky, K.E., Landon, B.E., McCarthy, E.P., Wee, C.C., Steinman, M.A. (2011). Conducting high-value secondary dataset analysis: an introductory guide and resources. Journal of General Internal Medicine, 26(8), 920-929.
  8. Fielding, N. (2004). Getting the most from archived qualitative data: epistemological, practical and professional obstacles. International Journal of Social Research Methodology, 7(1), 97-104.
  9. Smith, A.K., Ayanian, J.Z., Covinsky, K.E., Landon, B.E., McCarthy, E.P., Wee, C.C., Steinman, M.A. (2011). Conducting high-value secondary dataset analysis: an introductory guide and resources. Journal of General Internal Medicine, 26(8), 920-929.
  10. Fielding, N. (2004). Getting the most from archived qualitative data: epistemological, practical and professional obstacles. International Journal of Social Research Methodology, 7(1), 97-104.
  11. Smith, A.K., Ayanian, J.Z., Covinsky, K.E., Landon, B.E., McCarthy, E.P., Wee, C.C., Steinman, M.A. (2011). Conducting high-value secondary dataset analysis: an introductory guide and resources. Journal of General Internal Medicine, 26(8), 920-929.
  12. Fielding, N. (2004). Getting the most from archived qualitative data: epistemological, practical and professional obstacles. International Journal of Social Research Methodology, 7(1), 97-104.
  13. Fielding, N. (2004). Getting the most from archived qualitative data: epistemological, practical and professional obstacles. International Journal of Social Research Methodology, 7(1), 97-104.
  14. Fielding, N. (2004). Getting the most from archived qualitative data: epistemological, practical and professional obstacles. International Journal of Social Research Methodology, 7(1), 97-104.
  15. Bernard, H. R., & Ryan, G. W. (Eds.). (2010). Analyzing qualitative data: Systematic approaches. Sage.
  16. Mayring, P. (2000, June). Qualitative content analysis. In Forum: Qualitative social research (Vol. 1, No. 2).
  17. Bernard, H. R., & Ryan, G. W. (Eds.). (2010). Analyzing qualitative data: Systematic approaches. Sage.