In today’s data-driven world, companies like AbsolutData are at the forefront of harnessing the power of data analytics to drive business decisions and gain competitive advantages. As you prepare for your data analytics interview at AbsolutData, it’s essential to understand the types of questions you might encounter and how to approach them.
AbsolutData, a leading analytics and data science company, offers a dynamic environment for those passionate about turning raw data into actionable insights. To help you excel in your interview, let’s explore some common data analytics interview questions and their corresponding answers that you might encounter at AbsolutData.
SQL related questions
Question: What is SQL?
Answer: SQL stands for Structured Query Language. It’s a standard programming language designed for managing data held in a relational database management system (RDBMS).
Question: What are the different types of SQL commands?
Answer: SQL commands can be broadly categorized into four types:
- Data Definition Language (DDL): Used to define the database structure.
- Data Manipulation Language (DML): Used to manipulate data in the database.
- Data Control Language (DCL): Used to control access to data within the database.
- Transaction Control Language (TCL): Used to manage transactions within the database.
Question: What is the difference between CHAR and VARCHAR data types?
Answer:
- CHAR is a fixed-length data type, while VARCHAR is a variable-length data type.
- CHAR will always reserve the specified length, padding the extra space with blanks.
- VARCHAR will only use as much space as needed, up to the specified maximum length.
Question: What is a primary key?
Answer: A primary key is a column or a set of columns that uniquely identifies each row in a table.
Primary keys must contain unique values and cannot have NULL values.
A table can have only one primary key, although that key may span several columns.
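The uniqueness rule can be checked with a small sketch. It uses Python’s built-in sqlite3 module with an in-memory database as a stand-in for any RDBMS; the customers table and its rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, name TEXT NOT NULL)"
)
conn.execute("INSERT INTO customers (customer_id, name) VALUES (1, 'Asha')")

try:
    # Reusing customer_id 1 violates the primary key's uniqueness rule.
    conn.execute("INSERT INTO customers (customer_id, name) VALUES (1, 'Ben')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM customers").fetchone()[0]
print(duplicate_rejected, row_count)  # True 1
```

The second insert fails, so the table still holds only the original row.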
Question: What is the difference between INNER JOIN and LEFT JOIN?
Answer:
- INNER JOIN returns only the rows that have a match in both tables.
- LEFT JOIN returns all rows from the left table, together with the matched rows from the right table.
- When a left-table row has no match, LEFT JOIN still returns it and fills the right table’s columns with NULL.
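A short sketch of the difference, assuming Python’s built-in sqlite3 module and two hypothetical tables, customers and orders, where one customer has placed no orders:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL);
    INSERT INTO customers VALUES (1, 'Asha'), (2, 'Ben'), (3, 'Chen');
    INSERT INTO orders VALUES (10, 1, 250.0), (11, 1, 40.0), (12, 2, 99.0);
""")

# INNER JOIN: only customers that have at least one order.
inner_rows = conn.execute("""
    SELECT c.name, o.order_id
    FROM customers AS c
    INNER JOIN orders AS o ON o.customer_id = c.customer_id
    ORDER BY c.name, o.order_id
""").fetchall()

# LEFT JOIN: every customer; order columns are NULL (None) when there is no order.
left_rows = conn.execute("""
    SELECT c.name, o.order_id
    FROM customers AS c
    LEFT JOIN orders AS o ON o.customer_id = c.customer_id
    ORDER BY c.name, o.order_id
""").fetchall()

print(inner_rows)  # [('Asha', 10), ('Asha', 11), ('Ben', 12)]
print(left_rows)   # [('Asha', 10), ('Asha', 11), ('Ben', 12), ('Chen', None)]
```

Chen appears only in the LEFT JOIN result, with None standing in for the missing order.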
Question: What is a foreign key?
Answer: A foreign key is a field (or collection of fields) in one table that refers to the primary key (or another unique key) of another table.
It establishes a link between two tables, enforcing referential integrity.
The foreign key constraint prevents actions that would destroy links between tables.
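Referential integrity can be seen in a small sketch. This one assumes Python’s built-in sqlite3 module, where foreign key enforcement has to be switched on per connection; the customers and orders tables and the is_rejected helper are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when this is on
conn.executescript("""
    CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (
        order_id INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customers (customer_id)
    );
    INSERT INTO customers VALUES (1, 'Asha');
    INSERT INTO orders VALUES (10, 1);
""")

def is_rejected(sql):
    """Return True if the statement fails an integrity constraint."""
    try:
        conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True

# An order that points at a customer that does not exist is refused.
orphan_insert_rejected = is_rejected("INSERT INTO orders VALUES (11, 99)")
# Deleting a customer that an order still refers to is refused as well.
parent_delete_rejected = is_rejected("DELETE FROM customers WHERE customer_id = 1")

print(orphan_insert_rejected, parent_delete_rejected)  # True True
```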
Question: What is the difference between GROUP BY and DISTINCT?
Answer:
- GROUP BY is used to group rows that have the same values into summary rows, like “find the number of customers in each city”.
- DISTINCT is used to return unique combinations of the selected columns, removing duplicates from the result set without computing any aggregates.
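The two can be compared side by side in a minimal sketch, assuming Python’s built-in sqlite3 module and a made-up customers table:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (name TEXT, city TEXT);
    INSERT INTO customers VALUES
        ('Asha', 'Delhi'), ('Ben', 'Pune'), ('Chen', 'Delhi'), ('Dev', 'Delhi');
""")

# DISTINCT only removes duplicate values.
distinct_cities = conn.execute(
    "SELECT DISTINCT city FROM customers ORDER BY city"
).fetchall()

# GROUP BY collapses rows per city so an aggregate can be computed for each group.
customers_per_city = conn.execute(
    "SELECT city, COUNT(*) FROM customers GROUP BY city ORDER BY city"
).fetchall()

print(distinct_cities)     # [('Delhi',), ('Pune',)]
print(customers_per_city)  # [('Delhi', 3), ('Pune', 1)]
```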
Question: Explain the ACID properties of a transaction.
Answer:
- Atomicity: A transaction is treated as a single “unit”, which either completes entirely or not at all.
- Consistency: The database must remain in a consistent state before and after the transaction.
- Isolation: Transactions should be isolated from each other until they are completed.
- Durability: Once a transaction has been committed, the changes made by it will be permanent, surviving system failures.
Question: How do you prevent SQL injection attacks?
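Answer: SQL injection attacks can be prevented by using parameterized queries (prepared statements) or by using ORM (Object-Relational Mapping) libraries, which build parameterized queries for you. Both approaches keep user input separate from the SQL text, so the input is treated as a value and never executed as code. Validating input and granting the database account only the privileges it needs add further protection.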
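A small sketch of why parameterization matters, assuming Python’s built-in sqlite3 module (which uses ? as its placeholder); the users table and the malicious input are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (username TEXT, is_admin INTEGER);
    INSERT INTO users VALUES ('asha', 0), ('root', 1);
""")

malicious_input = "nobody' OR '1'='1"

# Unsafe: the input is pasted into the SQL text, so its quotes rewrite the query.
unsafe_sql = f"SELECT username FROM users WHERE username = '{malicious_input}'"
unsafe_rows = conn.execute(unsafe_sql).fetchall()

# Safe: the placeholder passes the input as a plain value, never as SQL.
safe_rows = conn.execute(
    "SELECT username FROM users WHERE username = ?", (malicious_input,)
).fetchall()

print(len(unsafe_rows))  # 2 (every user leaks)
print(len(safe_rows))    # 0 (no user has that literal name)
```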
Question: What is a view in SQL?
Answer: A view is a virtual table based on the result of a SELECT query. It does not store data on its own but rather displays data from other tables. Views can simplify complex queries and provide a layer of security by limiting access to certain columns or rows.
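As an illustration, the sketch below assumes Python’s built-in sqlite3 module and a hypothetical employees table; the employee_directory view hides the salary column and always reflects the current contents of the underlying table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employees (name TEXT, department TEXT, salary REAL);
    INSERT INTO employees VALUES ('Asha', 'Analytics', 90000), ('Ben', 'Sales', 60000);
    CREATE VIEW employee_directory AS SELECT name, department FROM employees;
""")

# The view exposes only the columns named in its SELECT.
cursor = conn.execute("SELECT * FROM employee_directory")
directory_columns = [column[0] for column in cursor.description]
rows_before = cursor.fetchall()

# The view stores no data, so a new row in the base table shows up immediately.
conn.execute("INSERT INTO employees VALUES ('Chen', 'Analytics', 80000)")
directory_rows = conn.execute(
    "SELECT name, department FROM employee_directory ORDER BY name"
).fetchall()

print(directory_columns)  # ['name', 'department']
print(directory_rows)     # [('Asha', 'Analytics'), ('Ben', 'Sales'), ('Chen', 'Analytics')]
```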
Question: What is a stored procedure in SQL?
Answer: A stored procedure is a named block of SQL code that is saved in the database and can be reused. It allows you to group a set of SQL statements so they can be called and executed multiple times. Stored procedures can accept input parameters and return values through output parameters or result sets.
Question: What is a transaction in SQL?
Answer: A transaction in SQL is a sequence of one or more SQL operations treated as a single logical unit of work: either all of its operations are committed or none of them are. Transactions provide the ACID properties (Atomicity, Consistency, Isolation, Durability) to ensure data integrity.
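The all-or-nothing behavior can be sketched with a money transfer. This assumes Python’s built-in sqlite3 module, whose connection object commits or rolls back when used in a with block; the accounts table and the transfer and balances helpers are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE accounts (
        owner TEXT PRIMARY KEY,
        balance INTEGER NOT NULL CHECK (balance >= 0)
    );
    INSERT INTO accounts VALUES ('asha', 100), ('ben', 50);
""")

def transfer(sender, receiver, amount):
    """Move money between accounts; both updates commit together or not at all."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes the block
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE owner = ?",
                (amount, receiver),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE owner = ?",
                (amount, sender),
            )
        return True
    except sqlite3.IntegrityError:
        return False

def balances():
    return dict(conn.execute("SELECT owner, balance FROM accounts"))

first_ok = transfer("asha", "ben", 30)    # succeeds
second_ok = transfer("asha", "ben", 500)  # debit would go negative, so it is rolled back
print(first_ok, second_ok, balances())    # True False {'asha': 70, 'ben': 80}
```

In the failed transfer the credit to the receiver had already run, yet the rollback undoes it, which is atomicity in practice.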
Question: What is normalization in databases?
Answer: Normalization is the process of organizing the columns (attributes) and tables (relations) of a relational database to reduce redundancy and avoid update, insert, and delete anomalies. It involves dividing large tables into smaller tables and defining relationships between them, typically by following the normal forms (1NF, 2NF, 3NF).
Question: Explain the difference between TRUNCATE, DELETE, and DROP in SQL.
Answer:
- TRUNCATE: Removes all rows from a table without logging individual row deletions, which usually makes it faster than DELETE. It cannot take a WHERE clause, and whether it can be rolled back depends on the database: Oracle and MySQL commit it implicitly, while SQL Server and PostgreSQL allow it to be rolled back inside a transaction.
- DELETE: Removes one or more rows from a table based on the condition specified in the WHERE clause, or all rows if no condition is given. It logs each row deletion, which makes it slower than TRUNCATE on large tables, and it can be rolled back.
- DROP: Deletes an entire table, including its structure, indexes, constraints, and data. It removes the table from the database entirely.
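A partial sketch of the contrast between DELETE and DROP, assuming Python’s built-in sqlite3 module. SQLite has no TRUNCATE statement, so that command is not shown; the logs table and the count_logs helper are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE logs (id INTEGER PRIMARY KEY, level TEXT);
    INSERT INTO logs (level) VALUES ('info'), ('error'), ('info');
""")

def count_logs():
    return conn.execute("SELECT COUNT(*) FROM logs").fetchone()[0]

# DELETE with a condition removes only matching rows and can be undone before commit.
conn.execute("DELETE FROM logs WHERE level = 'info'")
after_delete = count_logs()
conn.rollback()
after_rollback = count_logs()

# DELETE without a condition empties the table but keeps its definition.
conn.execute("DELETE FROM logs")
conn.commit()
after_delete_all = count_logs()

# DROP removes the table definition itself.
conn.execute("DROP TABLE logs")
conn.commit()
remaining_tables = conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table' AND name = 'logs'"
).fetchall()

print(after_delete, after_rollback, after_delete_all, remaining_tables)  # 1 3 0 []
```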
Other Technical Questions
Question: What are the mean, median, and mode, and how are they used in practice?
Answer: Mean is the average of a set of numbers, found by adding all values and dividing by the count. It’s used for financial planning, education grades, sports scores, and survey data analysis.
Median is the middle value when data is ordered, crucial for income distribution, real estate prices, age demographics, and test scores.
Mode is the most frequent value in a dataset, helpful in business for popular products, weather analysis, medical research, and grade distributions.
These measures offer insights into central tendencies, aiding decision-making in finance, healthcare, education, and more. They help summarize data and understand the typical or central value of a dataset.
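A quick sketch with Python’s built-in statistics module shows how the three measures differ on skewed data. The income figures are invented, and the single large value plays the role of an outlier.

```python
import statistics

# Hypothetical monthly incomes (in thousands); 250 is an outlier.
monthly_incomes = [30, 32, 32, 35, 38, 40, 250]

mean_income = statistics.mean(monthly_incomes)      # pulled upward by the outlier
median_income = statistics.median(monthly_incomes)  # middle value of the sorted data
mode_income = statistics.mode(monthly_incomes)      # most frequent value

print(round(mean_income, 2), median_income, mode_income)  # 65.29 35 32
```

The mean lands far above what most people in the sample earn, which is why the median is usually preferred for income and house-price data.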
Question: What is normal distribution?
Answer: Normal Distribution, also known as Gaussian distribution, is a bell-shaped probability distribution that is symmetric around the mean. In a normal distribution:
- The mean, median, and mode are all equal and located at the center of the distribution.
- The curve is bell-shaped, with most of the data clustered around the mean.
- The standard deviation determines the spread or width of the distribution.
- About 68% of the data falls within one standard deviation of the mean (1σ), 95% within two standard deviations (2σ), and 99.7% within three standard deviations (3σ).
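The 68-95-99.7 figures can be reproduced with a short sketch using NormalDist from Python’s built-in statistics module (Python 3.8 or later); the share_within helper is a name introduced here for illustration.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0.0, sigma=1.0)

def share_within(k):
    """Probability that a value lies within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

for k in (1, 2, 3):
    print(f"within {k} sigma: {share_within(k):.4f}")
# within 1 sigma: 0.6827
# within 2 sigma: 0.9545
# within 3 sigma: 0.9973
```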
Question: Logic for swapping two variables without using a temporary variable.
Answer: You can swap two variables without using a temporary variable by using arithmetic operations or the bitwise XOR operation. The XOR method works only on integers. Here are examples of both methods in Python:
- Using Arithmetic Operations:
a = 10
b = 20
a = a + b  # a now holds the sum of the two numbers
b = a - b  # b now holds the original value of a
a = a - b  # a now holds the original value of b
print("After swapping:")
print("a =", a)
print("b =", b)
- Using Bitwise XOR (Exclusive OR):
a = 10
b = 20
a = a ^ b  # a now holds the XOR of the two numbers
b = a ^ b  # b now holds the original value of a
a = a ^ b  # a now holds the original value of b
print("After swapping:")
print("a =", a)
print("b =", b)
Power BI
Question: What are the components of Power BI?
Answer: Power BI consists of several components:
- Power BI Desktop: A desktop application used to create reports and data models.
- Power BI Service (PowerBI.com): A cloud-based service for sharing, collaborating, and publishing reports and dashboards.
- Power BI Mobile: Mobile apps available for iOS, Android, and Windows devices for viewing and interacting with reports on the go.
- Power BI Gateway: Allows connecting on-premises data sources to Power BI Service.
Question: What is Power Query?
Answer: Power Query is a data connection technology that enables users to discover, connect, combine, and refine data across a wide variety of sources. It is used in Power BI to import and transform data before loading it into the data model.
Question: Explain the difference between calculated columns and measures in Power BI.
Answer:
- Calculated Columns: These are columns that are computed row by row from data in other columns of a table. They are static, computed during data refresh, and stored in the model. They can be used for filtering, sorting, and grouping data.
- Measures: Measures are calculations performed on the fly in response to user interactions, such as slicing and dicing data. They are dynamic and calculated at runtime when a user interacts with visuals. They are usually used for aggregations (like sums and averages) and other calculations that depend on the current filter context.
Question: What are Power BI visuals and how can you customize them?
Answer: Power BI visuals are the graphical representations of data within reports and dashboards. They can be customized in various ways:
- Changing colors, fonts, and styles.
- Adding interactive features like drill-through, tooltips, and slicers.
- Adjusting formatting options such as data labels, axes, and legends.
- Using custom visuals from the Power BI marketplace for unique visualizations.
Question: How can you handle row-level security in Power BI?
Answer: Row-level security in Power BI allows you to restrict data access at the row level based on roles or user identities. It is implemented by creating roles in Power BI Desktop, defining DAX filter rules for each role (for example, matching a column against the signed-in user’s email), and then assigning users to those roles in the Power BI Service.
Question: What is Power BI Embedded?
Answer: Power BI Embedded is a Microsoft Azure service that allows developers to embed fully interactive visuals and reports into custom applications without the need for users to have a Power BI license. It provides APIs for embedding Power BI reports, dashboards, and tiles into an application.
Question: How can you create a calculated table in Power BI?
Answer: To create a calculated table in Power BI, you can use the “New Table” feature in Power BI Desktop and define the table using DAX expressions. This allows you to create a table based on specific conditions or calculations.
Conclusion
AbsolutData offers an exciting opportunity for data enthusiasts to delve into the world of analytics, transforming data into valuable insights for clients across industries. By preparing for these key data analytics interview questions and understanding the methodologies and tools used in the field, you can showcase your expertise and passion for data-driven decision-making.
Remember to tailor your responses to your experiences, emphasizing problem-solving skills, analytical thinking, and the ability to communicate complex ideas effectively. Best of luck on your journey to becoming a valued member of the AbsolutData team!