Do you feel sluggish or lethargic most days? Are you susceptible to catching colds? If you blame the “daily grind” for feeling run down, you might be overlooking a simple nutrient deficiency.

Iron is needed to make hemoglobin, the part of our red blood cells that carries oxygen to the body’s tissues. In addition, iron works in conjunction with other micronutrients to make essential enzymes that affect our endocrine function and neurotransmitters, and it keeps us healthy by helping to regulate our immune system.


Symptoms of iron deficiency include shortness of breath, fatigue, difficulty regulating body temperature, lack of concentration, pale skin, dizziness, and an irregular or rapid heartbeat.
The best way to prevent deficiency is to eat a balanced diet containing a combination of sources of iron.

A few animal-based iron sources are:
Lean red meat, Chicken, Turkey, Oysters, Liver

Plant-based iron sources are:

Beans, lentils (legumes), Dark leafy greens, Tofu, Cashews, Fortified breakfast cereals, Whole grain and enriched breads

Similarly, in data analytics, deep knowledge of the data and how to use it is like iron for our health: a lack of it can lead to impaired function! Without that knowledge, an analysis can produce inaccurate or even misrepresented information, which in turn can mislead decision-making. Here are some tips to ensure you have an ample supply of iron in your data analytics work:

  • Understand the source of the data: primary real-world data is usually collected for a different purpose than the secondary objectives it is later used for. Understanding the origin of the data will help you understand its advantages and limitations.
  • Assess the quality of the data: conduct a data quality check before starting the analysis. Real-world data is complex and often contains inconsistencies, inaccuracies, and gaps. It is always better to identify data quality issues early in the project, before they surface in downstream systems and processes.
  • Evaluate the applicability of the data: can the research question be answered within the scope of the data? Is there adequate information to appropriately define an analytic cohort and identify exposures, outcomes, covariates, and confounders? Data should be sufficiently granular, contain historical information to determine baseline covariates, and represent an adequate duration of follow-up.
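
As a rough illustration of the quality-check tip above, here is a minimal Python sketch that counts missing values, implausible values, and repeated identifiers in a small set of records. The field names (patient_id, age, bmi), the sample rows, and the plausibility limits are all hypothetical and chosen only for illustration; real limits should come from clinical and data-source knowledge.

```python
from collections import Counter

# Hypothetical plausibility limits, for illustration only.
PLAUSIBLE_RANGES = {"age": (0, 120), "bmi": (10.0, 100.0)}


def quality_report(records, fields, key_field, ranges=PLAUSIBLE_RANGES):
    """Count missing values, out-of-range values, and duplicated keys."""
    missing = Counter()
    out_of_range = Counter()
    key_counts = Counter()

    for row in records:
        key_counts[row.get(key_field)] += 1
        for field in fields:
            value = row.get(field)
            # Treat absent, None, and empty-string values as missing.
            if value is None or value == "":
                missing[field] += 1
                continue
            if field in ranges:
                low, high = ranges[field]
                if not low <= float(value) <= high:
                    out_of_range[field] += 1

    return {
        "rows": len(records),
        "missing": dict(missing),
        "out_of_range": dict(out_of_range),
        "duplicate_keys": [k for k, n in key_counts.items() if n > 1],
    }


# Made-up sample rows, not real patient data.
records = [
    {"patient_id": "p1", "age": 54, "bmi": 27.4},
    {"patient_id": "p2", "age": 61, "bmi": 1043.0},  # implausible BMI
    {"patient_id": "p3", "age": None, "bmi": 31.2},  # missing age
    {"patient_id": "p1", "age": 54, "bmi": 27.4},    # repeated patient_id
]

report = quality_report(
    records, fields=["patient_id", "age", "bmi"], key_field="patient_id"
)
print(report)
```

A report like this does not fix anything by itself; it only tells you where to look before the analysis starts. Whether a flagged value is a data entry error, a unit mix-up, or a genuine outlier still has to be decided with knowledge of the source.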


Medical claims data consists of healthcare records collected for billing purposes. It usually contains enough information to substantiate the calculation of healthcare costs. Since healthcare providers want to be paid for the services they provide, it is usually possible to know whether a test was ordered for a patient. However, checking the actual result of that test typically requires the researcher to dig into other data sources.

EHR data stems from anything captured and stored in an electronic health record system. It is rich in clinical information but suffers from missing data as well as a lack of standardization. This is especially true of the current state of EHRs in the US, which lack an industry-wide standard. Because so much important information is stored as unstructured text, it is also prone to human error. It is not uncommon to discover humanly impossible numbers in such datasets (e.g., a BMI measurement close to, or more than, 1000).

Bridging the gap between the research question and the raw data requires scrutiny and a firm understanding of the data. After all, your conclusions can only be as good as what the data can provide. At KMK Consulting, our iron-strong expertise in handling real-world health data can help you find answers from a variety of real-world data sources.