Program Prospectus: Introduction to Data Analysis (2026)
Organizer: Institute for Integrated Development Studies (IIDS), Mandikhatar, Kathmandu
Resource Person: Dr. Uttam Sharma, Senior Research Fellow, IIDS
Timeline: January – March 2026 (8 Weeks)
Program Overview & Objectives
This training is an intensive, eight-week journey into the fundamental concepts and practical applications of statistics in social science research. The goal is to move beyond theoretical understanding and empower participants to handle complex datasets and produce publication-ready analysis.
Upon completion, participants will be able to:
- Execute Data Management: Master the art of coding, labeling, scaling, and cleaning data.
- Apply Statistical Techniques: Move from descriptive summaries to complex multivariate models.
- Interpret & Communicate: Translate statistical outputs into compelling narratives for reports and scientific papers.
- Analyze with Precision: Match specific analytical techniques (like ANOVA or Logistic Regression) to appropriate data types.
Participation Details
Prerequisites
- Education: A minimum of a Bachelor’s degree.
- Technical Skills: A foundational background in basic statistics and general computing knowledge.
- Advantage: A background in social science research is considered a significant asset.
Commitment & Software
- Time: At least 2 hours of in-person training per week, plus a commitment of at least 3 hours of independent study/practice at home per week.
- Software Policy: Instruction and hands-on exercises will be conducted primarily in Stata. The course is software-inclusive, however: participants are welcome and encouraged to use other statistical software (such as R, SPSS, or Python) if they prefer.
Detailed Course Structure
The curriculum is divided into five strategic parts designed to take a research project from conception to completion.
Part 1: Foundations & Data Preparation
- Research Design: Discussing areas of interest and identifying a specific research topic.
- Stata Orientation: Introduction to the interface and basic syntax.
- Data Cleaning: Techniques for importing data, labeling variables, constructing scales, and evaluating data structure and quality.
Part 2: Descriptive Statistics
- Summarization: Describing data distributions using the mean, median, mode, and variance.
- Reporting: Preparing professional-grade tables based on the trainees’ own data.
Part 3: Univariate & Bivariate Analysis
- Visualizing Insights: Effective use of graphical representations to communicate findings.
- Inferential Statistics: Introduction to random sampling, study design, and confidence intervals.
- Hypothesis Testing: Comparing groups and understanding p-values.
Part 4: Multivariate Techniques
This section focuses on regression modeling: estimating how an outcome depends on several explanatory variables at once, and using the fitted model for prediction.
- Linear Regression:
  - Simple and multiple linear regression models.
  - Interpretation of regression estimates.
- Logistic Regression:
  - Understanding why linear models are sometimes inappropriate.
  - Understanding Odds and Odds Ratios.
  - Interpretation of categorical and continuous covariates.
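For participants following along in Python, the idea of odds and odds ratios can be sketched with a small two-by-two table. The counts and group names below are hypothetical and chosen only to keep the arithmetic simple. The sketch also relies on a standard result: in a logistic regression with a single binary covariate, the coefficient equals the log of the odds ratio from the table.

```python
import math

# Hypothetical counts: outcome (employed yes/no) by group (trained vs. untrained).
trained = {"yes": 40, "no": 10}
untrained = {"yes": 20, "no": 30}

# Odds = number with the outcome divided by number without it.
odds_trained = trained["yes"] / trained["no"]
odds_untrained = untrained["yes"] / untrained["no"]

# Odds ratio: how many times larger the odds are in the trained group.
odds_ratio = odds_trained / odds_untrained
log_odds_ratio = math.log(odds_ratio)

# With one binary covariate, the logistic model is
#   log(odds) = intercept + coefficient * trained
# where intercept = log(odds in the reference group)
# and coefficient = log(odds ratio).
intercept = math.log(odds_untrained)
coefficient = log_odds_ratio


def inv_logit(z):
    """Convert a log-odds value back into a probability."""
    return 1.0 / (1.0 + math.exp(-z))


p_untrained = inv_logit(intercept)
p_trained = inv_logit(intercept + coefficient)

print("odds ratio:", round(odds_ratio, 3))
print("log odds ratio:", round(log_odds_ratio, 3))
print("P(outcome | untrained):", round(p_untrained, 3))
print("P(outcome | trained):", round(p_trained, 3))
```

The recovered probabilities match the raw proportions in each group (20 of 50 and 40 of 50), which is a useful check when first interpreting logistic output.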
Part 5: Integration & Writing
- Synthesis: Putting it all together by combining the data preparation, descriptive, and regression work from the earlier parts.
- Reporting: Integrating the results of the analyses into the trainees’ final reports and scientific interpretations.
The “Capstone” Research Paper
A hallmark of this training is its output-driven approach. Participants are not just students; they are researchers.
The Requirement: Every participant must select a research topic ahead of time that lends itself to either Multiple Linear Regression or Multiple Logistic Regression.
The Process: Throughout the eight weeks, you will use the data related to this topic for your exercises. You will also be asked to document why your topic is important, so that by the end of the course you have a draft paper or report ready for review.
Proposed 8-Week Training Schedule: Introduction to Data Analysis
This schedule is designed to balance conceptual lectures with intensive hands-on practice. Note: While the instructor will demonstrate using Stata, participants are encouraged to follow along and execute these tasks in the software of their choice (R, SPSS, Python, etc.).
Week 1: Research Foundations & Data Architecture
- Lecture: Defining research questions; the logic of quantitative inquiry.
- Hands-on: Introduction to the software interface; data importation (CSV, Excel, etc.); creating log files/do-files.
- Assignment: Submit your research topic and a 3-bullet justification on “Why this topic matters.”
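For participants who choose Python, the Week 1 import step might look like the minimal sketch below. It uses only the standard library, and the column names and values are hypothetical. The data is read from an in-memory string so the example is self-contained; with a real file, the same reader would be given a file opened with open(path, newline="").

```python
import csv
import io

# Hypothetical survey extract; a real session would read a file instead.
RAW_CSV = """id,age,district,income
1,34,Kathmandu,25000
2,,Lalitpur,31000
3,41,,
4,29,Bhaktapur,18000
"""

# Each row becomes a dictionary keyed by the header names.
rows = list(csv.DictReader(io.StringIO(RAW_CSV)))

# Count empty cells per column as a first look at data quality.
missing = {
    column: sum(1 for row in rows if row[column].strip() == "")
    for column in rows[0]
}


def to_number(text):
    """Convert text to float, keeping blanks as None rather than 0."""
    return float(text) if text.strip() else None


ages = [to_number(row["age"]) for row in rows]

print("rows:", len(rows))
print("columns:", list(rows[0].keys()))
print("missing per column:", missing)
print("ages:", ages)
```

Keeping blanks as None rather than zero matters later: a zero would silently pull down every mean computed from that column.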
Week 2: Data Cleaning & Management
- Lecture: Data structures and types (string vs. numeric); the ethics of data quality.
- Hands-on: Coding and labeling variables; handling missing values; constructing scales (e.g., Likert scales) and generating new variables.
- Assignment: Perform a “Data Audit” on your personal dataset to ensure it is clean for analysis.
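As one possible illustration of scale construction in Python, the sketch below builds a mean score from three 5-point Likert items. The item names, the choice of which item is reverse-coded, and the rule requiring at least two answered items are all assumptions made for the example, not course requirements.

```python
from statistics import mean

# Hypothetical responses to three 5-point Likert items; None marks a missing answer.
respondents = [
    {"id": 1, "trust_1": 4, "trust_2": 5, "trust_3": 3},
    {"id": 2, "trust_1": 2, "trust_2": None, "trust_3": 1},
    {"id": 3, "trust_1": None, "trust_2": None, "trust_3": 5},
]

ITEMS = ["trust_1", "trust_2", "trust_3"]
REVERSED = {"trust_3"}  # assumption: this item is negatively worded
SCALE_MAX = 5
MIN_ANSWERED = 2        # assumption: require at least two answered items


def scale_score(row):
    """Mean of the answered items, after reverse-coding; None if too few answers."""
    answered = []
    for item in ITEMS:
        value = row[item]
        if value is None:
            continue  # skip missing answers instead of treating them as 0
        if item in REVERSED:
            value = (SCALE_MAX + 1) - value  # 1<->5, 2<->4, 3 stays 3
        answered.append(value)
    if len(answered) < MIN_ANSWERED:
        return None
    return mean(answered)


scores = {row["id"]: scale_score(row) for row in respondents}
print(scores)
```

Respondent 3 answered only one item, so the sketch leaves that score missing rather than reporting a scale value based on a single question.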
Week 3: Descriptive Statistics & Distribution
- Lecture: Measures of central tendency (mean, median, mode) and dispersion (standard deviation, variance).
- Hands-on: Summarizing variables; creating professional frequency tables and cross-tabulations.
- Assignment: Produce a “Table 1” (Demographic/Descriptive Profile) for your research paper.
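The measures covered this week can be computed in Python with the standard statistics module, as in the sketch below. The household-size values are hypothetical. Note that variance and stdev here are the sample versions (dividing by n - 1).

```python
import statistics
from collections import Counter

# Hypothetical household sizes from a small survey.
household_size = [3, 4, 4, 5, 6, 4, 7, 5]

summary = {
    "n": len(household_size),
    "mean": statistics.mean(household_size),
    "median": statistics.median(household_size),
    "mode": statistics.mode(household_size),
    "variance": statistics.variance(household_size),  # sample variance (n - 1)
    "sd": statistics.stdev(household_size),           # sample standard deviation
}

# Frequency table: how many households report each size.
frequency = dict(sorted(Counter(household_size).items()))

for name, value in summary.items():
    print(f"{name:>8}: {value:.3f}" if isinstance(value, float) else f"{name:>8}: {value}")

print("size  count  percent")
for size, count in frequency.items():
    print(f"{size:>4}  {count:>5}  {100 * count / summary['n']:>6.1f}")
```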
Week 4: Univariate Analysis & Data Visualization
- Lecture: The visual language of data; understanding the “Shape” of data (skewness/kurtosis).
- Hands-on: Generating and customizing histograms, bar charts, and box plots.
- Assignment: Create two high-quality visualizations that highlight a key trend in your data.
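The "shape" statistics from this week's lecture can be sketched from central moments. The version below uses the plain moment-based definitions, under which a normal distribution has skewness 0 and kurtosis 3. Statistical packages may apply small-sample corrections or report excess kurtosis instead, so results can differ slightly between tools. Both datasets are hypothetical.

```python
def central_moment(values, k):
    """Average of (value - mean) raised to the k-th power."""
    m = sum(values) / len(values)
    return sum((v - m) ** k for v in values) / len(values)


def skewness(values):
    """Third central moment scaled by the variance to the power 3/2."""
    return central_moment(values, 3) / central_moment(values, 2) ** 1.5


def kurtosis(values):
    """Fourth central moment scaled by the squared variance (normal = 3)."""
    return central_moment(values, 4) / central_moment(values, 2) ** 2


symmetric = [1, 2, 3, 4, 5]
right_skewed = [1, 1, 2, 2, 3, 10]  # one large value stretches the right tail

print("symmetric    skew:", round(skewness(symmetric), 3),
      "kurtosis:", round(kurtosis(symmetric), 3))
print("right-skewed skew:", round(skewness(right_skewed), 3),
      "kurtosis:", round(kurtosis(right_skewed), 3))
```

A positive skewness value signals a long right tail, which is the pattern a histogram or box plot of the same variable should also show.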
Week 5: Bivariate Analysis & Group Comparisons
- Lecture: Random sampling, estimation, and the logic of hypothesis testing (p-values and confidence intervals).
- Hands-on: Running t-tests and ANOVA to compare group means, and Chi-square tests to assess the association between categorical variables.
- Assignment: Test a hypothesis regarding the relationship between two variables in your dataset.
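As a worked illustration of hypothesis testing in Python, the sketch below computes a Pearson chi-square test of independence for a two-by-two table, without continuity correction. The counts are hypothetical. The p-value uses an identity that holds only for one degree of freedom (that is, for two-by-two tables): the upper tail probability equals erfc(sqrt(chi2 / 2)). Larger tables would need a general chi-square distribution function from a statistics library.

```python
import math

# Hypothetical cross-tabulation: rows = group (A, B), columns = outcome (yes, no).
observed = [
    [30, 20],
    [20, 30],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Expected count under independence = row total * column total / n.
chi2 = 0.0
for i, row in enumerate(observed):
    for j, obs in enumerate(row):
        expected = row_totals[i] * col_totals[j] / n
        chi2 += (obs - expected) ** 2 / expected

# Degrees of freedom = (rows - 1) * (columns - 1) = 1 for a 2x2 table.
# For 1 degree of freedom only, P(X > chi2) = erfc(sqrt(chi2 / 2)).
p_value = math.erfc(math.sqrt(chi2 / 2))

print("chi-square:", round(chi2, 3))
print("p-value:", round(p_value, 4))
```

With these counts every expected cell is 25, so the statistic is 4.0 and the p-value falls just under 0.05.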
Week 6: Multivariate Analysis I – Linear Regression
- Lecture: Simple vs. Multiple Linear Regression; model assumptions and “Goodness of Fit.”
- Hands-on: Model building, checking for multicollinearity, and predicting outcomes.
- Assignment: Run your first regression model and interpret the coefficients.
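To show what interpreting a coefficient involves, the sketch below fits a simple (one-predictor) linear regression by ordinary least squares using only basic arithmetic. The variable names and values are hypothetical, and multiple regression with several predictors would normally be run in a statistical package rather than by hand.

```python
# Hypothetical data: years of schooling and monthly income (in thousands).
schooling = [1, 2, 3, 4, 5]
income = [2.1, 3.9, 6.2, 7.8, 10.0]

n = len(schooling)
mean_x = sum(schooling) / n
mean_y = sum(income) / n

# Least-squares slope = covariation of x and y divided by variation of x.
s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(schooling, income))
s_xx = sum((x - mean_x) ** 2 for x in schooling)
slope = s_xy / s_xx
intercept = mean_y - slope * mean_x


def predict(x):
    """Fitted value of income for a given number of years of schooling."""
    return intercept + slope * x


# Goodness of fit: share of the variation in y explained by the model.
ss_total = sum((y - mean_y) ** 2 for y in income)
ss_residual = sum((y - predict(x)) ** 2 for x, y in zip(schooling, income))
r_squared = 1 - ss_residual / ss_total

print("intercept:", round(intercept, 3))
print("slope:", round(slope, 3))
print("R-squared:", round(r_squared, 4))
print("predicted income at 6 years:", round(predict(6), 3))
```

The slope reads as the expected change in the outcome for a one-unit increase in the predictor; here, each additional year of schooling is associated with about 1.97 more units of income in this made-up sample.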
Week 7: Multivariate Analysis II – Logistic Regression
- Lecture: Why Linear Regression fails for binary outcomes; Introduction to Logit theory and Odds Ratios.
- Hands-on: Running logistic regression with both categorical and continuous covariates.
- Assignment: Compare the results of a linear vs. logistic model on your binary outcome variable.
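The comparison in this week's assignment can be sketched in Python on a tiny hypothetical dataset. A straight line fitted to a 0/1 outcome can produce "probabilities" below 0 or above 1, while the logistic model keeps predictions strictly between 0 and 1. The logistic fit below uses a hand-written Newton-Raphson loop with a fixed number of iterations, which is an illustrative simplification; statistical packages add convergence checks and safeguards.

```python
import math

# Hypothetical data: x is a continuous covariate, y is a binary outcome.
x = [1, 2, 3, 4, 5, 6]
y = [0, 0, 1, 0, 1, 1]
n = len(x)

# Linear model fitted by ordinary least squares.
mean_x = sum(x) / n
mean_y = sum(y) / n
slope = (sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
         / sum((xi - mean_x) ** 2 for xi in x))
intercept = mean_y - slope * mean_x


def linear_prediction(value):
    return intercept + slope * value


def inv_logit(z):
    return 1.0 / (1.0 + math.exp(-z))


# Logistic model fitted by Newton-Raphson on the log-likelihood.
b0, b1 = 0.0, 0.0
for _ in range(25):
    p = [inv_logit(b0 + b1 * xi) for xi in x]
    w = [pi * (1 - pi) for pi in p]
    # Gradient of the log-likelihood.
    g0 = sum(yi - pi for yi, pi in zip(y, p))
    g1 = sum(xi * (yi - pi) for xi, yi, pi in zip(x, y, p))
    # Information matrix (negative Hessian), 2x2.
    h00 = sum(w)
    h01 = sum(wi * xi for wi, xi in zip(w, x))
    h11 = sum(wi * xi * xi for wi, xi in zip(w, x))
    det = h00 * h11 - h01 ** 2
    # Newton step: solve the 2x2 system directly.
    b0 += (h11 * g0 - h01 * g1) / det
    b1 += (h00 * g1 - h01 * g0) / det


def logistic_prediction(value):
    return inv_logit(b0 + b1 * value)


print("x    linear  logistic")
for value in (0, 3.5, 7):
    print(value, round(linear_prediction(value), 3),
          round(logistic_prediction(value), 3))
```

At x = 0 and x = 7 the linear predictions fall outside the 0 to 1 range, while the logistic predictions do not.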
Week 8: Synthesis & Scientific Interpretation
- Lecture: How to write about data; translating “Statistical Significance” into “Policy Significance.”
- Hands-on: Finalizing the results section of the trainees’ reports; peer review of findings.
- Final Output: A draft research report/scientific paper based on the eight-week journey.