Interview - Blog | Minova - ATS Resume Builder

Introduction

Data analysts transform raw data into actionable insights that drive business decisions. Junior data analysts are expected to have strong skills in Excel, SQL, data visualization tools, and basic statistics to support data-driven decision making.

This guide covers essential interview questions for Junior Data Analysts. We explore Excel functions, SQL queries, data visualization with Tableau and Power BI, statistics fundamentals, and data analysis best practices to help you prepare for your first data analyst role.

Excel Fundamentals (5 Questions)

1. What are the most important Excel functions for data analysis?

Answer: Essential Excel functions every data analyst should know:

VLOOKUP/XLOOKUP: Lookup values in tables
IF/IFS: Conditional logic
SUMIF/SUMIFS: Conditional summation
COUNTIF/COUNTIFS: Conditional counting
PIVOT TABLES: Summarize and analyze data
INDEX/MATCH: More flexible than VLOOKUP
TEXT functions: LEFT, RIGHT, MID, CONCATENATE
DATE functions: TODAY, DATEDIF, EOMONTH

# VLOOKUP example
=VLOOKUP(A2, Products!A:C, 3, FALSE)

# SUMIFS example (sum sales where region="East" and product="Widget")
=SUMIFS(Sales!C:C, Sales!A:A, "East", Sales!B:B, "Widget")

# INDEX/MATCH example (more flexible than VLOOKUP)
=INDEX(Prices!C:C, MATCH(A2, Prices!A:A, 0))

# Conditional formatting with IF
=IF(B2>1000, "High", IF(B2>500, "Medium", "Low"))

Rarity: Very Common Difficulty: Easy

2. Explain the difference between VLOOKUP and INDEX/MATCH.

Answer:

VLOOKUP:
- Simpler syntax
- Only looks to the right
- Less flexible
- Slower for large datasets
INDEX/MATCH:
- More complex syntax
- Can look left or right
- More flexible
- Faster performance
- Can return entire rows/columns

# VLOOKUP - lookup value in column A, return from column C
=VLOOKUP(A2, A:C, 3, FALSE)

# INDEX/MATCH - equivalent but more flexible
=INDEX(C:C, MATCH(A2, A:A, 0))

# INDEX/MATCH advantage: can look left
=INDEX(A:A, MATCH(C2, C:C, 0))  # VLOOKUP can't do this

Rarity: Very Common Difficulty: Easy

3. How do you create and use Pivot Tables?

Answer: Pivot Tables summarize large datasets quickly.

Steps:
1. Select data range
2. Insert → Pivot Table
3. Drag fields to Rows, Columns, Values
4. Apply filters and formatting
Use Cases: Summarize sales by region, analyze trends, create reports

# Pivot Table structure:
Rows: Product Category
Columns: Quarter
Values: Sum of Sales
Filters: Region

# Calculated fields in Pivot Tables
Profit Margin = (Revenue - Cost) / Revenue

# Grouping dates
Right-click date → Group → Select Months/Quarters/Years

Rarity: Very Common Difficulty: Easy

4. What is conditional formatting and when would you use it?

Answer: Conditional formatting applies visual formatting based on cell values.

Use Cases:
- Highlight top/bottom values
- Show data bars or color scales
- Identify duplicates
- Flag outliers
- Create heat maps

# Highlight cells greater than 1000
Select range → Conditional Formatting → Highlight Cells Rules → Greater Than

# Color scale (gradient)
Select range → Conditional Formatting → Color Scales

# Data bars
Select range → Conditional Formatting → Data Bars

# Custom formula
=AND($B2>1000, $C2="Active")

Rarity: Common Difficulty: Easy

5. How do you remove duplicates and handle missing data in Excel?

Answer: Data cleaning is essential for accurate analysis.

# Remove duplicates
Data tab → Remove Duplicates → Select columns

# Find duplicates with conditional formatting
Select range → Conditional Formatting → Highlight Cells Rules → Duplicate Values

# Handle missing data
# Option 1: Filter and delete
Filter column → Uncheck (Blanks) → Delete visible rows

# Option 2: Fill with average
=IF(ISBLANK(A2), AVERAGE(A:A), A2)

# Option 3: Fill down
Select range → Ctrl+D (Windows) or Cmd+D (Mac)

# Find and replace blanks
Ctrl+H → Find what: (leave blank) → Replace with: 0 or N/A

Rarity: Very Common Difficulty: Easy

SQL Basics (5 Questions)

6. Write a SQL query to select all columns from a table.

Answer: Basic SELECT statement retrieves data from tables.

-- Select all columns
SELECT * FROM employees;

-- Select specific columns
SELECT first_name, last_name, salary
FROM employees;

-- Select with alias
SELECT 
    first_name AS "First Name",
    last_name AS "Last Name",
    salary AS "Annual Salary"
FROM employees;

-- Select distinct values
SELECT DISTINCT department
FROM employees;

-- Limit results
SELECT * FROM employees
LIMIT 10;

Rarity: Very Common Difficulty: Easy

7. How do you filter data using WHERE clause?

Answer: WHERE clause filters rows based on conditions.

-- Simple condition
SELECT * FROM employees
WHERE salary > 50000;

-- Multiple conditions with AND
SELECT * FROM employees
WHERE department = 'Sales' AND salary > 60000;

-- Multiple conditions with OR
SELECT * FROM employees
WHERE department = 'Sales' OR department = 'Marketing';

-- IN operator
SELECT * FROM employees
WHERE department IN ('Sales', 'Marketing', 'IT');

-- BETWEEN operator
SELECT * FROM employees
WHERE salary BETWEEN 50000 AND 80000;

-- LIKE operator (pattern matching)
SELECT * FROM employees
WHERE first_name LIKE 'J%';  -- Starts with J

-- IS NULL
SELECT * FROM employees
WHERE manager_id IS NULL;

-- NOT operator
SELECT * FROM employees
WHERE department NOT IN ('HR', 'Finance');

Rarity: Very Common Difficulty: Easy

8. Explain JOIN operations and their types.

Answer: JOINs combine data from multiple tables.

INNER JOIN: Returns matching rows from both tables
LEFT JOIN: Returns all rows from left table, matching from right
RIGHT JOIN: Returns all rows from right table, matching from left
FULL OUTER JOIN: Returns all rows from both tables

-- INNER JOIN
SELECT e.first_name, e.last_name, d.department_name
FROM employees e
INNER JOIN departments d ON e.department_id = d.department_id;

-- LEFT JOIN (all employees, even without department)
SELECT e.first_name, e.last_name, d.department_name
FROM employees e
LEFT JOIN departments d ON e.department_id = d.department_id;

-- Multiple joins
SELECT 
    e.first_name,
    d.department_name,
    l.city
FROM employees e
INNER JOIN departments d ON e.department_id = d.department_id
INNER JOIN locations l ON d.location_id = l.location_id;

-- Self join (employees and their managers)
SELECT 
    e.first_name AS employee,
    m.first_name AS manager
FROM employees e
LEFT JOIN employees m ON e.manager_id = m.employee_id;

Rarity: Very Common Difficulty: Medium

9. How do you use GROUP BY and aggregate functions?

Answer: GROUP BY groups rows and aggregate functions summarize data.

-- Count employees by department
SELECT department, COUNT(*) AS employee_count
FROM employees
GROUP BY department;

-- Average salary by department
SELECT 
    department,
    AVG(salary) AS avg_salary,
    MIN(salary) AS min_salary,
    MAX(salary) AS max_salary
FROM employees
GROUP BY department;

-- HAVING clause (filter groups)
SELECT department, AVG(salary) AS avg_salary
FROM employees
GROUP BY department
HAVING AVG(salary) > 60000;

-- Multiple grouping columns
SELECT 
    department,
    job_title,
    COUNT(*) AS count,
    AVG(salary) AS avg_salary
FROM employees
GROUP BY department, job_title
ORDER BY department, avg_salary DESC;

-- Common aggregate functions
-- COUNT, SUM, AVG, MIN, MAX, COUNT(DISTINCT)

Rarity: Very Common Difficulty: Medium

10. What is the difference between WHERE and HAVING?

Answer:

WHERE: Filters rows before grouping
HAVING: Filters groups after grouping
WHERE: Cannot use aggregate functions
HAVING: Can use aggregate functions

-- WHERE - filter before grouping
SELECT department, AVG(salary) AS avg_salary
FROM employees
WHERE salary > 40000  -- Filter individual rows
GROUP BY department;

-- HAVING - filter after grouping
SELECT department, AVG(salary) AS avg_salary
FROM employees
GROUP BY department
HAVING AVG(salary) > 60000;  -- Filter groups

-- Both together
SELECT department, AVG(salary) AS avg_salary
FROM employees
WHERE hire_date > '2020-01-01'  -- Filter rows first
GROUP BY department
HAVING COUNT(*) > 5;  -- Then filter groups

-- This would be WRONG:
-- SELECT department FROM employees
-- WHERE COUNT(*) > 5;  -- Error: can't use aggregate in WHERE

Rarity: Very Common Difficulty: Easy

Data Visualization (4 Questions)

11. What are the key principles of effective data visualization?

Answer: Good visualizations communicate insights clearly.

Principles:
- Choose the right chart type (bar for comparison, line for trends, pie for parts of whole)
- Keep it simple (avoid clutter)
- Use appropriate colors (consistent, accessible)
- Label clearly (titles, axes, legends)
- Tell a story (highlight key insights)
- Consider audience (technical vs non-technical)

Loading diagram...

Rarity: Common Difficulty: Easy

12. When would you use a bar chart vs a line chart?

Answer: Different chart types serve different purposes:

Bar Chart:
- Compare categories
- Discrete data
- Examples: Sales by region, product comparison
Line Chart:
- Show trends over time
- Continuous data
- Examples: Monthly revenue, stock prices
Other Charts:
- Pie Chart: Parts of a whole (use sparingly)
- Scatter Plot: Relationship between two variables
- Histogram: Distribution of continuous data

Rarity: Common Difficulty: Easy

13. What is Tableau and what are its key features?

Answer: Tableau is a leading data visualization and business intelligence tool.

Key Features:
- Drag-and-drop interface (no coding required)
- Connect to multiple data sources (databases, Excel, cloud)
- Interactive dashboards
- Real-time data updates
- Calculated fields and parameters
- Sharing and collaboration
Common Tasks:
- Create worksheets (individual visualizations)
- Build dashboards (multiple visualizations)
- Apply filters and parameters
- Create calculated fields
- Publish to Tableau Server/Online

Rarity: Very Common Difficulty: Easy

14. What is the difference between Tableau and Power BI?

Answer: Both are popular BI tools with different strengths:

Tableau:
- Better for complex visualizations
- More intuitive for beginners
- Stronger data visualization capabilities
- More expensive
Power BI:
- Better Microsoft integration
- More affordable
- Strong for Excel users
- Better for data modeling
- DAX for calculations
Choice depends on:
- Budget
- Existing tech stack
- Complexity of visualizations
- Team skills

Rarity: Common Difficulty: Easy

Statistics & Analysis (4 Questions)

15. What measures of central tendency do you know?

Answer: Measures of central tendency describe the center of a dataset:

Mean: Average (sum / count)
- Sensitive to outliers
- Use for normally distributed data
Median: Middle value when sorted
- Robust to outliers
- Use for skewed data
Mode: Most frequent value
- Use for categorical data

import numpy as np

data = [1, 2, 2, 3, 4, 5, 100]

mean = np.mean(data)  # 16.71 (affected by outlier 100)
median = np.median(data)  # 3 (not affected by outlier)
# mode = 2 (most frequent)

print(f"Mean: {mean}")
print(f"Median: {median}")

Rarity: Very Common Difficulty: Easy

16. How do you identify outliers in a dataset?

Answer: Outliers are data points that differ significantly from other observations.

Methods:
- Visual: Box plots, scatter plots
- Statistical:
  - IQR method (1.5 × IQR beyond Q1/Q3)
  - Z-score (|z| > 3)
  - Standard deviation (beyond 2-3 std devs)

import numpy as np

data = np.array([10, 12, 13, 12, 11, 14, 13, 15, 100, 12])

# IQR method
Q1 = np.percentile(data, 25)
Q3 = np.percentile(data, 75)
IQR = Q3 - Q1

lower_bound = Q1 - 1.5 * IQR
upper_bound = Q3 + 1.5 * IQR

outliers = data[(data < lower_bound) | (data > upper_bound)]
print(f"Outliers: {outliers}")  # [100]

# Z-score method
mean = np.mean(data)
std = np.std(data)
z_scores = np.abs((data - mean) / std)
outliers_z = data[z_scores > 3]
print(f"Outliers (Z-score): {outliers_z}")

Rarity: Common Difficulty: Medium

17. What is the difference between correlation and causation?

Answer:

Correlation: Statistical relationship between variables
- Measured by correlation coefficient (-1 to 1)
- Doesn't imply causation
Causation: One variable directly causes changes in another
- Requires controlled experiments
- Correlation is necessary but not sufficient

Examples:

Ice cream sales and drowning deaths are correlated (both increase in summer)
But ice cream doesn't cause drowning (confounding variable: temperature)

-- Calculate correlation in SQL (simplified)
SELECT 
    CORR(sales, temperature) AS correlation
FROM daily_data;

-- Positive correlation: both increase together
-- Negative correlation: one increases, other decreases
-- Zero correlation: no relationship

Rarity: Very Common Difficulty: Easy

18. How do you calculate percentage change?

Answer: Percentage change measures relative change between two values.

# Formula
Percentage Change = ((New Value - Old Value) / Old Value) × 100

# Excel formula
=(B2-A2)/A2*100

# Example:
Old Value: 100
New Value: 120
Change: (120-100)/100 = 0.20 = 20% increase

# Year-over-Year growth
=(Sales_2023 - Sales_2022) / Sales_2022 * 100

-- SQL percentage change
SELECT 
    year,
    revenue,
    LAG(revenue) OVER (ORDER BY year) AS prev_year_revenue,
    ((revenue - LAG(revenue) OVER (ORDER BY year)) / 
     LAG(revenue) OVER (ORDER BY year) * 100) AS pct_change
FROM annual_sales;

Rarity: Very Common Difficulty: Easy

Business Intelligence & Reporting (2 Questions)

19. What is a KPI and how do you choose the right ones?

Answer: KPI (Key Performance Indicator) is a measurable value that shows how effectively objectives are being achieved.

Characteristics of good KPIs:
- Specific: Clear and well-defined
- Measurable: Quantifiable
- Achievable: Realistic
- Relevant: Aligned with business goals
- Time-bound: Has a timeframe
Examples:
- Sales: Monthly revenue, conversion rate
- Marketing: Customer acquisition cost, ROI
- Operations: Order fulfillment time, error rate
- Customer: Satisfaction score, retention rate

Rarity: Common Difficulty: Easy

20. How do you present data insights to non-technical stakeholders?

Answer: Effective communication is crucial for data analysts.

Best Practices:
- Start with the conclusion (what they need to know)
- Use simple visualizations (avoid complex charts)
- Tell a story (context, insight, recommendation)
- Avoid jargon (explain technical terms)
- Focus on business impact (revenue, cost, efficiency)
- Provide actionable recommendations
- Be prepared for questions
Structure:
1. Executive summary
2. Key findings
3. Supporting data/visualizations
4. Recommendations
5. Next steps

Rarity: Common Difficulty: Medium

Table of Contents

Junior Data Analyst Interview Questions: Complete Guide

Introduction

Excel Fundamentals (5 Questions)

1. What are the most important Excel functions for data analysis?

2. Explain the difference between VLOOKUP and INDEX/MATCH.

3. How do you create and use Pivot Tables?

4. What is conditional formatting and when would you use it?

5. How do you remove duplicates and handle missing data in Excel?

SQL Basics (5 Questions)

6. Write a SQL query to select all columns from a table.

7. How do you filter data using WHERE clause?

8. Explain JOIN operations and their types.

9. How do you use GROUP BY and aggregate functions?

10. What is the difference between WHERE and HAVING?

Data Visualization (4 Questions)

11. What are the key principles of effective data visualization?

12. When would you use a bar chart vs a line chart?

13. What is Tableau and what are its key features?

14. What is the difference between Tableau and Power BI?

Statistics & Analysis (4 Questions)

15. What measures of central tendency do you know?

16. How do you identify outliers in a dataset?

17. What is the difference between correlation and causation?

18. How do you calculate percentage change?

Business Intelligence & Reporting (2 Questions)

19. What is a KPI and how do you choose the right ones?

20. How do you present data insights to non-technical stakeholders?

Your Next Interview is Just One Resume Away

Share this post

Weekly career tips that actually work

Weekly career tips that actually work

Related Posts

Junior Data Scientist Interview Questions: Complete Guide

Junior Mobile Developer (iOS) Interview Questions: Complete Guide

Junior Mobile Developer (Android) Interview Questions: Complete Guide

Your Next Interview is Just One Resume Away

Share this post

Cut Your Resume Writing Time by 90%