1. What is the difference between DBMS and RDBMS?
A database management system or DBMS is system software that can create, retrieve, update, and manage a database. It ensures the consistency of data and sees to it that it is organized and easily accessible by acting as an interface between the database and its end-users or application software. DBMS can be classified into four types:
- Hierarchical Database: It has a treelike structure with the data being stored in a hierarchical format. A parent in a hierarchical database can have multiple children, but a child can have only one parent.
- Network Database: This type of database is presented as a graph that can have many-to-many relationships, allowing entities to have multiple connections.
- Relational Database: It is the most widely used type of database. Data is represented as tables, and the values in the rows and columns are related to each other.
- Object-Oriented Database: The data values and operations are stored as objects in this type of database, and these objects have multiple relationships among them.
An RDBMS is a DBMS built on the relational model: it stores data as a collection of tables, and relationships are defined through the fields that these tables have in common. MS SQL Server, MySQL, IBM DB2, Oracle, and Amazon Redshift are all examples of RDBMSs.
2. What is SQL?
SQL stands for Structured Query Language. It is the standard language for RDBMS and is useful in handling organized data with entities or variables with relations between them. SQL is used for communicating with databases.
According to ANSI, SQL is the standard language for maintaining an RDBMS and for manipulating the data it holds. It can be used to create and delete databases and, among other things, to fetch and modify the rows of a table.
3. What is normalization and what are its types?
Normalization reduces data redundancy and undesirable dependencies by organizing the fields and tables of a database. It involves constructing tables and setting up relationships between those tables according to certain rules. Applying these rules removes redundancy and inconsistent dependencies, which makes the database more flexible and easier to maintain.
The different forms of normalization are as follows:
- First Normal Form: If every attribute in a relation is single-valued, then it is in the first normal form. If it contains a composite or multi-valued attribute, it violates the first normal form.
- Second Normal Form: A relation is said to be in the second normal form if it has met the conditions for the first normal form and does not have any partial dependency, i.e., it does not have a non-prime attribute that relies on any proper subset of any candidate key of the table. Often, the solution to this problem is to specify a single-column primary key.
- Third Normal Form: A relation is in the third normal form when it meets the conditions for the second normal form and there is not any transitive dependency between the non-prime attributes, i.e., all the non-prime attributes are decided only by the candidate keys of the relation and not by other non-prime attributes.
- Boyce-Codd Normal Form: A relation is in the Boyce-Codd normal form or BCNF if it meets the conditions of the third normal form, and for every functional dependency, the left-hand side is a super key. A relation is in BCNF if and only if X is a super key for every non-trivial functional dependency in form X –> Y.
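As a rough illustration of removing a transitive dependency (the step from 2NF to 3NF), the sketch below uses Python's built-in sqlite3 module. The tables orders_flat, customers, and orders and their columns are hypothetical names chosen for this example only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Unnormalized: customer_city depends on customer_id, not on the key order_id
# (a transitive dependency), so the city is repeated for every order.
conn.executescript("""
CREATE TABLE orders_flat (
    order_id      INTEGER PRIMARY KEY,
    customer_id   INTEGER NOT NULL,
    customer_city TEXT    NOT NULL
);
INSERT INTO orders_flat VALUES (1, 10, 'Pune'), (2, 10, 'Pune'), (3, 20, 'Delhi');

-- 3NF decomposition: each customer's city is stored exactly once.
CREATE TABLE customers (
    customer_id INTEGER PRIMARY KEY,
    city        TEXT NOT NULL
);
CREATE TABLE orders (
    order_id    INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL REFERENCES customers(customer_id)
);
INSERT INTO customers SELECT DISTINCT customer_id, customer_city FROM orders_flat;
INSERT INTO orders    SELECT order_id, customer_id FROM orders_flat;
""")

# Joining the two tables reconstructs the original rows without loss.
rebuilt = conn.execute("""
    SELECT o.order_id, o.customer_id, c.city
    FROM orders AS o
    JOIN customers AS c ON c.customer_id = o.customer_id
    ORDER BY o.order_id
""").fetchall()
original = conn.execute("SELECT * FROM orders_flat ORDER BY order_id").fetchall()
print(rebuilt == original)
```

After the split, changing a customer's city means updating one row in customers rather than every matching order.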
4. What are Joins in SQL?
Joins in SQL are used to combine rows from two or more tables based on a related column between them. Various types of joins can be used to retrieve data, depending on the relationship between tables.
There are four types of Joins:
- Inner Join
- Left Join
- Right Join
- Full Join
5. What are the applications of SQL?
The major applications of SQL are listed below:
- Writing data integration scripts
- Setting and running analytical queries
- Retrieving subsets of information within a database for analytics applications and transaction processing
- Adding, updating, and deleting rows and columns of data in a database
6. What is meant by table and field in SQL?
In a relational database, a table is a collection of data organized in rows and columns. Tables are used to store and manage data in a structured way, with each row representing a unique record and each column representing specific information about the record. Tables are often named according to the type of data they contain, such as “customers”, “orders”, or “employees”.
A field, also known as a column or an attribute, is a single piece of information stored in a table. Each field is named and has a specific data type, such as text, number, date, or Boolean, that determines the type of data that can be stored in the field. For example, a “customers” table might have fields for the customer’s name, address, phone number, and email address.
Fields can also have other properties, such as a maximum length or a default value, that define how the data is stored and how it can be used. In addition, fields can be used to define relationships between tables by referencing the primary key of another table or by creating foreign keys that link related records across different tables.
7. What is a relational database?
A relational database is a type of database that organizes data into tables with rows and columns. It establishes relationships between tables using keys.
8. What is a primary key?
A primary key is a unique identifier for a record in a table. It ensures that each row in the table is uniquely identifiable.
9. What is the difference between a primary key and a unique key?
A primary key uniquely identifies a record in a table; it cannot contain duplicate or null values, and a table can have only one. A unique key also enforces uniqueness but can allow null values, and a table can have several unique keys.
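The difference can be seen in a minimal sketch, here using Python's built-in sqlite3 module with a hypothetical users table. How many NULLs a unique column accepts varies by system (SQLite accepts several, SQL Server only one), so treat the behaviour below as SQLite's rather than universal.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# NOT NULL is spelled out because SQLite, unlike most systems, does not
# imply it for non-integer primary keys.
conn.execute("""
    CREATE TABLE users (
        username TEXT PRIMARY KEY NOT NULL,
        email    TEXT UNIQUE
    )
""")

# Several rows may leave the UNIQUE column NULL.
conn.execute("INSERT INTO users VALUES ('alice', NULL)")
conn.execute("INSERT INTO users VALUES ('bob', NULL)")
conn.execute("INSERT INTO users VALUES ('carol', 'carol@example.com')")

def try_insert(row):
    """Return True if the row is accepted, False if a constraint rejects it."""
    try:
        conn.execute("INSERT INTO users VALUES (?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False

duplicate_key = try_insert(("alice", "a@example.com"))       # duplicate primary key
null_key = try_insert((None, "n@example.com"))               # NULL primary key
duplicate_email = try_insert(("dave", "carol@example.com"))  # duplicate unique value
print(duplicate_key, null_key, duplicate_email)
```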
10. What is the difference between UNION and UNION ALL in SQL?
UNION combines the result sets of two or more SELECT statements, removing duplicate rows. UNION ALL also combines result sets but retains all rows, including duplicates.
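A small sketch of the difference, using Python's built-in sqlite3 module; the online_customers and store_customers tables are hypothetical and exist only for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE online_customers (name TEXT);
CREATE TABLE store_customers  (name TEXT);
INSERT INTO online_customers VALUES ('Asha'), ('Ben');
INSERT INTO store_customers  VALUES ('Ben'), ('Chen');
""")

union_rows = conn.execute("""
    SELECT name FROM online_customers
    UNION
    SELECT name FROM store_customers
    ORDER BY name
""").fetchall()

union_all_rows = conn.execute("""
    SELECT name FROM online_customers
    UNION ALL
    SELECT name FROM store_customers
    ORDER BY name
""").fetchall()

print(union_rows)      # 'Ben' appears once
print(union_all_rows)  # 'Ben' appears twice
```

Because UNION ALL skips the duplicate check, it is usually the cheaper choice when duplicates are impossible or acceptable.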
11. What is a stored procedure?
A stored procedure is a named set of SQL statements that are stored in the database and can be called and executed multiple times.
11. What is Denormalization?
Denormalization is the process of deliberately moving a database from a higher normal form to a lower one. It introduces redundancy into a table by incorporating data from related tables, usually to speed up reads by avoiding joins.
12. What are the different normal forms?
Normal forms are commonly described in six levels, which are explained below:
- First Normal Form (1NF):
A table is in 1NF when duplicate columns have been removed, related data has been placed in its own table, and each row can be identified by a unique column or set of columns.
- Second Normal Form (2NF):
The table meets all requirements of 1NF. Subsets of data that apply to multiple rows are placed in separate tables, and relationships between those tables are created using primary keys.
- Third Normal Form (3NF):
The table meets all requirements of 2NF, and columns that do not depend on the primary key are removed.
- Fourth Normal Form (4NF):
A table is in 4NF if it does not contain two or more independent, multivalued facts describing the same entity.
- Fifth Normal Form (5NF):
A table is in 5th Normal Form only if it is in 4NF and it cannot be decomposed into any number of smaller tables without loss of data.
- Sixth Normal Form (6NF):
Sixth Normal Form is rarely used in practice. In the definition given by Date, Darwen, and Lorentzos, a table is in 6NF when it satisfies no non-trivial join dependencies at all, which in effect means that each table holds a key and at most one other attribute.
13. What is an Index?
An index is a performance-tuning structure that allows faster retrieval of records from a table. It keeps an entry for each indexed value, so matching rows can be located without scanning the whole table.
14. What are all the different types of indexes?
There are three types of indexes:
- Unique Index.
This index does not allow duplicate values in the indexed column. A unique index is created automatically when a primary key is defined.
- Clustered Index.
This type of index reorders the physical storage of the table rows to match the key values and searches based on those values. Each table can have only one clustered index.
- NonClustered Index.
A nonclustered index does not alter the physical order of the table; it maintains a separate logical ordering of the data. In SQL Server, a table can have up to 999 nonclustered indexes.
15. What is a relationship and what are its types?
A database relationship is a connection between the tables in a database. The main types of relationship are as follows:
- One to One Relationship.
- One to Many Relationship.
- Many to One Relationship.
- Self-Referencing Relationship.
16. What is the difference between a LEFT JOIN and a RIGHT JOIN in SQL?
In SQL, a LEFT JOIN and a RIGHT JOIN are both types of outer join that can be used to combine data from two or more tables. The difference between them lies in which table’s data is preserved if there is no matching data in the other table.
- In a LEFT JOIN, all the rows from the table on the left-hand side of the JOIN keyword (the “left table”) are included in the result set, even if there is no matching data in the table on the right-hand side (the “right table”).
- In a RIGHT JOIN, all the rows from the table on the right-hand side of the JOIN keyword (the “right table”) are included in the result set, even if there is no matching data in the left table.
In summary, the difference between a LEFT JOIN and a RIGHT JOIN is the table whose data is preserved when there is no match in the other table.
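The following sketch shows the preserved-table behaviour using Python's built-in sqlite3 module and hypothetical customers and orders tables. Because SQLite versions before 3.39 do not support RIGHT JOIN, the right join is written here as its equivalent LEFT JOIN with the two tables swapped.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE orders    (id INTEGER PRIMARY KEY, customer_id INTEGER, item TEXT);
INSERT INTO customers VALUES (1, 'Asha'), (2, 'Ben');
INSERT INTO orders    VALUES (100, 1, 'Laptop'), (101, 3, 'Phone');
""")

# LEFT JOIN: every customer is kept; Ben has no order, so item is NULL (None).
left_rows = conn.execute("""
    SELECT c.name, o.item
    FROM customers AS c
    LEFT JOIN orders AS o ON o.customer_id = c.id
    ORDER BY c.id
""").fetchall()

# Swapping the tables keeps every order instead, which is what
# "customers RIGHT JOIN orders" would return.
right_equivalent_rows = conn.execute("""
    SELECT c.name, o.item
    FROM orders AS o
    LEFT JOIN customers AS c ON o.customer_id = c.id
    ORDER BY o.id
""").fetchall()

print(left_rows)
print(right_equivalent_rows)
```

Columns from the side with no match come back as NULL, which Python shows as None.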
17. How would you retrieve all the records from a “customers” table in SQL?
To retrieve all the records from a table called “customers” in SQL, you would use the following query:
SELECT * FROM customers;
18. What is the difference between SQL’s WHERE and HAVING clauses?
In SQL, the WHERE clause is used to filter rows based on a condition on a column, while the HAVING clause is used to filter groups based on an aggregate function.
The WHERE clause is applied before any grouping takes place and filters individual rows based on a condition. The HAVING clause, on the other hand, is applied after the grouping and filters groups based on the results of aggregate functions such as COUNT, SUM, and AVG.
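A minimal sketch of both clauses in one query, run through Python's built-in sqlite3 module; the orders table and its data are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE orders (customer TEXT, status TEXT, amount REAL);
INSERT INTO orders VALUES
    ('Asha', 'paid',      50),
    ('Asha', 'paid',      70),
    ('Asha', 'cancelled', 500),
    ('Ben',  'paid',      30),
    ('Chen', 'paid',      200);
""")

big_spenders = conn.execute("""
    SELECT customer, SUM(amount) AS total
    FROM orders
    WHERE status = 'paid'        -- row filter, applied before grouping
    GROUP BY customer
    HAVING SUM(amount) > 100     -- group filter, applied after aggregation
    ORDER BY customer
""").fetchall()

print(big_spenders)
```

The cancelled order is discarded by WHERE before the totals are computed, and Ben's group is discarded by HAVING because its total is too small.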
19. What is the difference between a function and a stored procedure in SQL?
In SQL, a function returns a value, while a stored procedure does not necessarily return a value and may execute a series of operations or tasks. Functions can be used as part of a SQL statement or expression to return a value. In contrast, stored procedures can be used to encapsulate a series of SQL statements and can be executed as a single unit. Additionally, functions can be used within stored procedures, but stored procedures cannot be used within functions.
20. What is the purpose of an index in SQL, and how does it work?
An index is used to improve the performance of queries by allowing for faster data retrieval. It creates a separate data structure that stores the values of one or more columns and allows faster access to the data based on those values.
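One way to observe this is to compare the query plan before and after creating an index. The sketch below uses Python's built-in sqlite3 module and SQLite's EXPLAIN QUERY PLAN; the customers table and the index name idx_customers_email are hypothetical, and the exact plan wording differs between SQLite versions and other database systems.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, email TEXT)")
conn.executemany(
    "INSERT INTO customers (name, email) VALUES (?, ?)",
    [(f"user{i}", f"user{i}@example.com") for i in range(1000)],
)

query = "SELECT * FROM customers WHERE email = 'user500@example.com'"

def plan(sql):
    """Return the human-readable detail column of SQLite's query plan."""
    return " ".join(row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql))

plan_before = plan(query)  # no index yet: the whole table is scanned
conn.execute("CREATE INDEX idx_customers_email ON customers (email)")
plan_after = plan(query)   # the planner can now search the index

print(plan_before)
print(plan_after)
```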
21. What is the difference between a clustered and a non-clustered index?
A clustered index determines the physical order of data in a table, whereas a non-clustered index creates a separate structure that points to the data.
22. What is the difference between DDL and DML statements in SQL?
DDL (Data Definition Language) statements are used to define and manage the structure of the database objects. DML (Data Manipulation Language) statements are used to retrieve, insert, update, and delete data.
23. What is a self-join?
A self-join is a join where a table is joined with itself. It is used when data in a table relates to other data in the same table.
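A common case is an employee table in which each row points to its manager in the same table. The sketch below uses Python's built-in sqlite3 module; the employees table and its contents are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employees (
    id         INTEGER PRIMARY KEY,
    name       TEXT,
    manager_id INTEGER REFERENCES employees(id)
);
INSERT INTO employees VALUES
    (1, 'Dana', NULL),
    (2, 'Eli',  1),
    (3, 'Fay',  1),
    (4, 'Gus',  2);
""")

# The same table appears twice under different aliases:
# "e" plays the employee role, "m" the manager role.
pairs = conn.execute("""
    SELECT e.name AS employee, m.name AS manager
    FROM employees AS e
    JOIN employees AS m ON e.manager_id = m.id
    ORDER BY e.id
""").fetchall()

print(pairs)
```

Dana has no manager, so the inner join leaves her out of the employee column; a LEFT JOIN would keep her with a NULL manager.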
24. What is the difference between CHAR and VARCHAR data types?
CHAR is a fixed-length character data type, while VARCHAR is a variable-length character data type. VARCHAR only uses the required storage space, while CHAR always uses the specified length.
25. What is a transaction in SQL?
A transaction is a sequence of SQL statements that are treated as a single unit. It ensures that all statements are executed successfully, or the changes are rolled back.
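The all-or-nothing behaviour can be sketched with Python's built-in sqlite3 module, whose connection object acts as a context manager that commits on success and rolls back on an exception. The accounts table and the transfer helper are hypothetical names for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE accounts (
        id      INTEGER PRIMARY KEY,
        balance INTEGER NOT NULL CHECK (balance >= 0)
    )
""")
conn.execute("INSERT INTO accounts VALUES (1, 50), (2, 0)")
conn.commit()

def transfer(amount, src, dst):
    """Move money between accounts; both updates succeed or neither does."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute("UPDATE accounts SET balance = balance + ? WHERE id = ?", (amount, dst))
            conn.execute("UPDATE accounts SET balance = balance - ? WHERE id = ?", (amount, src))
        return True
    except sqlite3.IntegrityError:
        return False

ok_small = transfer(20, 1, 2)   # fits within the balance
ok_large = transfer(100, 1, 2)  # would overdraw account 1, so it is rolled back
balances = conn.execute("SELECT id, balance FROM accounts ORDER BY id").fetchall()
print(ok_small, ok_large, balances)
```

In the failed transfer the credit to account 2 has already run when the debit violates the CHECK constraint, and the rollback undoes it.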
26. What is the purpose of the GROUP BY clause in SQL?
The GROUP BY clause is used to group rows based on one or more columns. It is often used in conjunction with aggregate functions to perform calculations on grouped data.
27. What is the purpose of the ORDER BY clause in SQL?
The ORDER BY clause is used to sort the result set in ascending or descending order based on one or more columns.
28. What is the difference between a primary key and a unique constraint?
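A primary key is a combination of a unique constraint and a not-null constraint. It uniquely identifies a record, and no duplicate or null values are allowed. A unique constraint on its own also prevents duplicates but can allow null values, and a table can have several unique constraints but only one primary key.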
28. What is the difference between a candidate key and a composite key?
A candidate key is a column or a set of columns that can uniquely identify a record. A composite key is a key that consists of two or more columns.
29. What is the purpose of the CASE statement in SQL?
The CASE statement (more precisely, an expression) is used to perform conditional logic in SQL. It evaluates a list of conditions and returns the result associated with the first condition that is true, or the ELSE value if none is.
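A short sketch of CASE mapping scores to grades, run through Python's built-in sqlite3 module; the exam_results table and the grade boundaries are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE exam_results (student TEXT, score INTEGER);
INSERT INTO exam_results VALUES ('Asha', 92), ('Ben', 71), ('Chen', 48);
""")

# Conditions are checked top to bottom; the first match wins.
graded = conn.execute("""
    SELECT student,
           CASE
               WHEN score >= 90 THEN 'A'
               WHEN score >= 60 THEN 'B'
               ELSE 'F'
           END AS grade
    FROM exam_results
    ORDER BY student
""").fetchall()

print(graded)
```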
30. What is the purpose of the ISNULL function in SQL?
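In SQL Server, the ISNULL function is used to replace NULL values with a specified value: ISNULL(expression, replacement) returns the replacement when the expression is NULL and returns the expression as-is otherwise. Other systems provide similar functions, such as IFNULL in MySQL and NVL in Oracle.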
31. What is the purpose of the COALESCE function in SQL?
The COALESCE function is used to return the first non-NULL expression from a list of expressions. It helps handle NULL values effectively.
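As an illustration, the sketch below picks the first available phone number for each contact. It uses Python's built-in sqlite3 module, and the contacts table is a hypothetical example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE contacts (name TEXT, mobile TEXT, landline TEXT);
INSERT INTO contacts VALUES
    ('Asha', '555-0101', '555-0199'),
    ('Ben',  NULL,       '555-0142'),
    ('Chen', NULL,       NULL);
""")

# COALESCE walks its arguments left to right and returns the first non-NULL one.
best_numbers = conn.execute("""
    SELECT name, COALESCE(mobile, landline, 'no phone on file') AS phone
    FROM contacts
    ORDER BY name
""").fetchall()

print(best_numbers)
```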
32. What is the purpose of the TRIGGER in SQL?
A trigger is a special type of stored procedure that automatically executes in response to specific database events, such as inserting, updating, or deleting records.
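A typical use is an audit log. The sketch below uses Python's built-in sqlite3 module and SQLite's trigger syntax, which differs in detail from SQL Server, MySQL, and Oracle; the products and price_audit tables and the trigger name log_price_change are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE products (id INTEGER PRIMARY KEY, price REAL);
CREATE TABLE price_audit (product_id INTEGER, old_price REAL, new_price REAL);

-- Fires automatically after every price update on products.
CREATE TRIGGER log_price_change
AFTER UPDATE OF price ON products
FOR EACH ROW
BEGIN
    INSERT INTO price_audit VALUES (OLD.id, OLD.price, NEW.price);
END;

INSERT INTO products VALUES (1, 9.99);
UPDATE products SET price = 12.5 WHERE id = 1;
""")

# No code inserted into price_audit directly; the trigger did.
audit_rows = conn.execute("SELECT * FROM price_audit").fetchall()
print(audit_rows)
```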
33. What is a stored procedure?
A stored procedure consists of one or more SQL statements that access the database system. Several SQL statements are consolidated into a stored procedure and can be executed whenever and wherever required.
34. What is ETL in SQL?
ETL (Extract, Transform, Load) is a common process used in data warehousing and business intelligence to move data from various sources into a data warehouse or database. The process involves three steps:
- Extract: In this step, data is extracted from various sources such as databases, files, or web services. This may involve using SQL queries to extract data from databases, or using APIs and web scraping tools to extract data from web services or files.
- Transform: Once the data has been extracted, it is transformed or cleaned to make it suitable for storage and analysis. This may involve applying filters, aggregating data, or converting data types. SQL is commonly used to transform data as part of the ETL process.
- Load: The final step is to load the transformed data into a data warehouse or database. This may involve loading data into tables, creating indexes, or performing other database operations.
The ETL process is critical for data integration, as it allows organizations to collect data from various sources, transform it into a consistent format, and store it in a central location for analysis and reporting. ETL tools such as Microsoft SQL Server Integration Services (SSIS) or Talend can automate much of the ETL process and provide a visual interface for designing and managing data flows.
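The three steps can be sketched in a few lines of Python using only the standard library. This is a toy illustration rather than a production pipeline: the CSV content, the cleaning rules, and the sales table are all assumptions made for the example.

```python
import csv
import io
import sqlite3

# Extract: read raw rows from a CSV source (an in-memory string here).
raw_csv = """name,signup_date,amount
 asha ,2024-01-05,10.50
BEN,2024-01-06,
chen,2024-01-07,7.25
"""
rows = list(csv.DictReader(io.StringIO(raw_csv)))

# Transform: trim and normalize names, convert amounts, drop incomplete rows.
clean = [
    (r["name"].strip().title(), r["signup_date"], float(r["amount"]))
    for r in rows
    if r["amount"].strip()
]

# Load: write the cleaned rows into the target table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (name TEXT, signup_date TEXT, amount REAL)")
conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", clean)
conn.commit()

loaded = conn.execute("SELECT name, amount FROM sales ORDER BY name").fetchall()
print(loaded)
```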
35. What is the purpose of the MAX() function in SQL?
The MAX() function is used to retrieve the maximum value from a column in a table.
36. How to select unique records from a table?
Unique records can be selected from a table by using the DISTINCT keyword:
SELECT DISTINCT StudentID, StudentName FROM Student;
37. List different Types of Index in SQL?
In SQL, different indexes can be created to improve query performance. Here are some of the most common types of indexes:
- Clustered Index: A clustered index in SQL organizes and stores the data rows in a table based on the values of one or more columns. This index determines the physical order of the data within the table, making it highly efficient for range queries and sorting operations. By sorting and storing the data in the table based on the values of the clustered index, queries that filter or sort by those columns can be performed faster since they can utilize the physical order of the data.
- Non-Clustered Index: A non-clustered index creates a separate structure that stores a copy of the indexed columns and a pointer to the corresponding data row in the table. It allows for faster retrieval of specific rows or ranges of rows but can be less efficient than a clustered index for sorting operations.
- Unique Index: A unique index enforces the constraint that the values in the indexed column(s) must be unique across all rows in the table. Depending on the table’s primary key, it can be either a clustered or non-clustered index.
- Composite Index: A composite index is an index that is created on two or more columns in a table. It can improve the performance of queries that filter on multiple columns, allowing more efficient sorting and matching of the indexed values.
- Full-Text Index: A full-text index supports searching text-based data in a table, such as articles, documents, or web pages. It allows for fast and efficient searching of large amounts of text using algorithms that analyze the data’s words, phrases, and context.
- Spatial Index: A spatial index is used to optimize the querying of geographic or location-based data in a table, such as maps, GPS coordinates, or boundaries. It uses specialized data structures and algorithms to store and search for spatial data efficiently.
38. What is the purpose of the RANK() function in SQL?
The RANK() function is used to assign a rank to each row within a result set based on a specified order.
39. What is the purpose of the DATE() function in SQL?
The DATE() function is used to extract the date part from a datetime or timestamp value. It is available in MySQL, among others; other systems use equivalents such as CAST(value AS DATE).
40. What are the uses of SQL?
The following operations can be performed by using a SQL database:
- Creating new databases
- Inserting new data
- Deleting existing data
- Updating records
- Retrieving the data
- Creating and dropping tables
- Creating functions and views
- Converting data types
41. What are entities and relationships?
Entities: An entity can be a person, place, thing, or any identifiable object for which data can be stored in a database.
For example, in a company’s database, employees, projects, salaries, etc., can be referred to as entities.
Relationships: A relationship between entities can be referred to as a connection between two tables or entities.
For example, in a college database, the student entity and the department entity are associated with each other.
41. What is the difference between the RANK() and DENSE_RANK() functions?
The RANK() function assigns a rank to each row within an ordered partition of the result set. Rows with equal values receive the same rank, and the next rank is the previous rank plus the number of tied rows, which leaves gaps. If we have three records at rank 4, for example, the next rank is 7.
The DENSE_RANK() function also ranks each row within a partition based on the ordering column, but with no gaps. Rows with equal values receive the same rank, and the next rank is the next consecutive number. If we have three records at rank 4, for example, the next rank is 5.
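The three-way tie at rank 4 can be reproduced with a small sketch. It uses Python's built-in sqlite3 module and assumes the bundled SQLite is version 3.25 or later, which is when window functions were added; the scores table is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE scores (player TEXT, points INTEGER);
INSERT INTO scores VALUES
    ('a', 100), ('b', 90), ('c', 80),
    ('d', 70),  ('e', 70), ('f', 70),
    ('g', 60);
""")

ranked = conn.execute("""
    SELECT player,
           RANK()       OVER (ORDER BY points DESC) AS rnk,
           DENSE_RANK() OVER (ORDER BY points DESC) AS dense_rnk
    FROM scores
    ORDER BY points DESC, player
""").fetchall()

for row in ranked:
    print(row)
```

Players d, e, and f share rank 4 under both functions; player g then gets 7 from RANK() and 5 from DENSE_RANK().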
42. What is schema in SQL Server?
A schema is the logical structure of a database. It specifies the database’s entities and the relationships among them, the constraints that apply to the data, and the data types of its columns. In SQL Server, a schema is also a named container that groups database objects such as tables and views.
In data warehousing, star schema and snowflake schema are two of the most popular designs. A star schema has a central fact table linked directly to dimension tables, giving it a star shape, whereas a snowflake schema further normalizes the dimension tables into sub-dimensions, giving it a snowflake shape.
Any database architecture is built on the foundation of schemas.
43. NoSQL vs SQL
In summary, the following are five major distinctions between SQL and NoSQL databases:
- SQL databases are relational, while NoSQL databases are non-relational.
- SQL databases have a predefined schema and use Structured Query Language. NoSQL databases use dynamic schemas for unstructured data.
- SQL databases typically scale vertically, while NoSQL databases typically scale horizontally.
- SQL databases are table-based, whereas NoSQL databases are document, key-value, graph, or wide-column stores.
- SQL databases excel at multi-row transactions, while NoSQL databases excel at unstructured data such as documents and JSON.
44. What is the difference between NOW() and CURRENT_DATE()?
Both are MySQL functions. NOW() returns a constant time that indicates the time at which the statement began to execute. (Within a stored function or trigger, NOW() returns the time at which the function or triggering statement began to execute.)
The simple difference between NOW() and CURRENT_DATE() is that NOW() returns the current date and time in the format ‘YYYY-MM-DD HH:MM:SS’, while CURRENT_DATE() returns only the current date in the format ‘YYYY-MM-DD’.
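The same date-and-time versus date-only distinction can be tried without a MySQL server. The sketch below uses Python's built-in sqlite3 module; note that it is an analogue, not the MySQL functions themselves, since SQLite has no NOW() and its CURRENT_TIMESTAMP and CURRENT_DATE keywords report UTC.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# SQLite has no NOW(); CURRENT_TIMESTAMP plays the same role here (in UTC).
timestamp, date_only = conn.execute(
    "SELECT CURRENT_TIMESTAMP, CURRENT_DATE"
).fetchone()

print(timestamp)  # formatted as YYYY-MM-DD HH:MM:SS
print(date_only)  # formatted as YYYY-MM-DD
```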
45. How to create a stored procedure using SQL Server?
A stored procedure is a piece of prepared SQL code that you can save and reuse over and over.
So, if you have a SQL query that you create frequently, save it as a stored procedure and then call it to run it.
You may also supply parameters to a stored procedure so that it can act based on the value(s) of the parameter(s) given.
Stored Procedure Syntax
CREATE PROCEDURE procedure_name
AS
sql_statement
GO
Execute a Stored Procedure
EXEC procedure_name;
46. What is Database Black Box Testing?
Black Box Testing is a software testing approach that exercises the functions of an application without knowledge of its internal code structure, implementation details, or internal paths. It focuses on the inputs and outputs of the application and is driven entirely by software requirements and specifications. Applied to a database, this means checking the data that goes in and the results that come back rather than the internals. It is also known as behavioral testing.
47. What is the difference between CHAR and VARCHAR2 datatype in SQL?
Both CHAR and VARCHAR2 store character data, but VARCHAR2 is used for character strings of variable length, whereas CHAR is used for strings of fixed length. For example, CHAR(10) always occupies 10 characters, and a shorter string is padded with trailing spaces to that length, whereas VARCHAR2(10) can store a string of any length up to 10 characters, such as 2, 6, or 8, and uses only the space it needs.
48. What is the difference between DROP and TRUNCATE commands?
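The DROP command removes a table completely, including its structure, data, indexes, and constraints, whereas the TRUNCATE command removes all the rows from the table but keeps the table structure. Whether either command can be rolled back depends on the database system: in Oracle and MySQL both are committed implicitly, while SQL Server and PostgreSQL allow them to be rolled back inside a transaction.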
49. What is SQL Injection?
SQL injection is a type of flaw in website and web application code that allows attackers to take control of back-end processes and access, retrieve, and delete sensitive data stored in databases. In this attack, malicious SQL statements are entered into an input field, and the database is exposed to the attacker once they are executed. The technique is widely used against data-driven applications to gain access to sensitive data and execute administrative tasks on databases. It is also known as an SQLi attack.
The following are some examples of SQL injection:
- Retrieving hidden data, where a SQL query is modified to return additional results.
- UNION attacks, which retrieve data from several database tables.
- Examining the database, to gather information about its version and structure.
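The sketch below shows how such an attack works and one standard defence, parameterized queries. It uses Python's built-in sqlite3 module against a hypothetical users table; storing plain-text passwords is done here only to keep the demonstration short and is not a recommendation.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (username TEXT, password TEXT);
INSERT INTO users VALUES ('alice', 's3cret'), ('bob', 'hunter2');
""")

malicious_input = "' OR '1'='1"

# Vulnerable: the input is pasted into the SQL text, so the quote characters
# change the query's logic and every row matches.
unsafe_sql = "SELECT username FROM users WHERE password = '" + malicious_input + "'"
leaked = conn.execute(unsafe_sql).fetchall()

# Safer: a placeholder sends the input as data, never as SQL.
safe = conn.execute(
    "SELECT username FROM users WHERE password = ?", (malicious_input,)
).fetchall()

print(len(leaked), len(safe))  # prints "2 0"
```

With the placeholder, the database compares the password column to the literal string, quotes included, so nothing matches.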
50. How many Aggregate functions are available in SQL?
SQL aggregate functions provide information about a database’s data. AVG, for example, returns the average of a database column’s values.
Seven aggregate functions are commonly listed. The first five are standard SQL, while FIRST() and LAST() are available only in some systems, such as MS Access:
- AVG(): returns the average of the non-null values in the specified column.
- COUNT(): returns the number of rows. COUNT(*) includes rows with null values, while COUNT(column) counts only non-null values.
- MAX(): returns the largest value in the group.
- MIN(): returns the smallest value in the group.
- SUM(): returns the total of the non-null values in the specified column.
- FIRST(): returns the first value of an expression.
- LAST(): returns the last value of an expression.
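The standard aggregates, and the way they treat NULL, can be checked with a short sketch. It uses Python's built-in sqlite3 module and a hypothetical payments table; FIRST() and LAST() are left out because SQLite does not provide them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE payments (id INTEGER PRIMARY KEY, amount INTEGER);
INSERT INTO payments (amount) VALUES (10), (20), (NULL), (30);
""")

summary = conn.execute("""
    SELECT COUNT(*)      AS all_rows,      -- counts every row
           COUNT(amount) AS non_null_rows, -- skips the NULL amount
           SUM(amount)   AS total,
           AVG(amount)   AS average,       -- NULL is ignored, so 60 / 3
           MIN(amount)   AS smallest,
           MAX(amount)   AS largest
    FROM payments
""").fetchone()

print(summary)
```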