SAS Interview Questions
Last updated on Jan 09, 2024
If you want to build a career in SAS, you are on the right platform. Whether you are a fresher or an experienced professional, HKR Trainings will help you prepare to crack SAS interviews. We provide the top 50 frequently asked SAS interview questions, divided into three levels: beginner, intermediate, and advanced. Before we move to the questions, let us understand what SAS is. SAS is a statistical software suite used for data management, advanced analytics, business intelligence, multivariate analysis, predictive analytics, and criminal investigation. It is easy to learn and a good option if you already know SQL. Now let's get started with the frequently asked SAS interview questions prepared by the HKR Trainings team.
Now let's have a look into the SAS interview questions based on the levels.
- Basic SAS Interview Questions
- SAS Interview Questions for experienced
- Advanced SAS Interview Questions
Most Frequently Asked SAS Interview Questions
- What is SAS?
- Mention a few capabilities of the SAS Framework.
- Name some of the key concepts of SAS.
- What is PDV?
- What are the differences between PROC MEANS and PROC SUMMARY?
- What are the parameters of the SCAN function?
- What is the difference between DO WHILE and DO UNTIL?
- What is the function of the CATX function?
- Define the various ways to build macro variables.
- Explain interleaving.
Basic SAS Interview Questions
What is SAS? What are the functions it performs?
Ans: SAS stands for Statistical Analysis System. It is an integrated set of software products mainly used for statistical data analysis and visualization.
Some of its functions are:
- Using SAS, we can retrieve, manipulate, and analyze data.
- It can perform numerical analysis.
- We can write programs that analyze data and create reports with the help of several SAS tools.
- Using SAS, we can produce high-quality data analytics.
What are the uses of SAS?
Ans: Although SAS is closely associated with business, economics, and finance, time series data also arise in many other fields. The main uses of SAS procedures are economic and financial modeling, economic analysis, forecasting, financial reporting, and the analysis and manipulation of time series data. SAS is useful whenever time dependencies, simultaneous relationships, or dynamic processes complicate data analysis.
What is the basic syntax style in a SAS program?
Ans: To run successfully, a SAS program needs to follow these basic rules:
- Every statement must end with a semicolon.
- A DATA statement names the data set being created.
- An INPUT statement describes the names of the variables in the data set.
- Words within a statement must be separated by at least one space.
Ex: INFILE 'E:\StatHW\yourfilename.dat';
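The rules above can be seen together in a short program. This is only a minimal sketch: the data set name `scores`, the variables `name` and `score`, and the inline records are made up for illustration, and DATALINES is used in place of an external file so the example is self-contained.

```sas
/* Minimal sketch: read three inline records and list them. */
data scores;                 /* DATA statement names the data set     */
    input name $ score;      /* INPUT statement describes the variables */
    datalines;
Ann 82
Raj 91
Lee 77
;
run;

proc print data=scores;      /* PROC step reports on the data */
run;
```

To read from an external file instead, an INFILE statement naming the file would replace the DATALINES block.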
Describe the basic structure of a SAS program.
Ans: A SAS program is made up of the following steps:
- DATA step: This step reads and manipulates data and creates SAS data sets.
- PROC step: This step analyzes the data and produces reports.
What are the data types in SAS?
Ans: SAS has two data types: numeric and character. Dates are stored as numeric values (the number of days since January 1, 1960), and SAS provides date functions and formats to work with them.
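SAS stores a date as a number counting days from January 1, 1960, so ordinary arithmetic works on dates. A small sketch that writes the raw and formatted values to the log (the variable names `d` and `next_week` are illustrative):

```sas
data _null_;
    d = '01JAN1960'd;          /* date literal                      */
    put d=;                    /* raw numeric value: d=0            */
    put d= date9.;             /* formatted value: d=01JAN1960      */
    next_week = d + 7;         /* plain arithmetic on a date        */
    put next_week= date9.;     /* next_week=08JAN1960               */
run;
```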
What are the main components used in SAS programming?
Ans: The main components used in SAS programming are datasets, statements and variables.
How can you sort variables in SAS?
Ans: The PROC SORT procedure is used to sort the observations of a data set by one or more variables named in a BY statement. Listing several BY variables sorts the observations within groups.
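As a brief illustration, the sketch below sorts the SASHELP.CLASS sample data set that ships with SAS. The output name `class_sorted` is an arbitrary choice.

```sas
/* Sort by sex, and within each sex by age from oldest to youngest. */
proc sort data=sashelp.class out=class_sorted;
    by sex descending age;
run;

proc print data=class_sorted;
run;
```

Using OUT= leaves the original data set untouched; without it, PROC SORT replaces the input data set with the sorted version.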
Mention a few capabilities of the SAS Framework.
Ans: The SAS Framework has four capabilities: Access, Manage, Analyze, and Present.
- Access: SAS allows us to access data from multiple sources such as raw data files, Oracle databases, SAS data sets, and Excel files.
- Manage: We can then manage this data to subset it, create variables, and validate and clean it.
- Analyze: We can analyze this data, since statistical analysis is the core function of SAS. We can perform simple analyses such as averages and frequencies, and complex analyses such as forecasting and regression.
- Present: Finally, we can present the results as summary, list, or graphic reports. These reports can be printed, written to a data file, or published online.
Why should you choose SAS over other data analytics tools?
Ans: The following are reasons often given for choosing SAS over other data analytics tools:
- SAS is easy to learn. For people who already know SQL, PROC SQL provides a familiar starting point.
- SAS is designed to handle very large amounts of data and is regarded as a strong choice alongside other leading tools such as R and Python.
- SAS provides functional graphical capabilities. With a little learning, it is possible to customize these plots.
- SAS updates are released in a controlled environment and are well tested. R and Python, on the other hand, have open contribution models, so their latest versions may contain errors.
- SAS is widely used in corporate jobs across the world, and is reported to hold about 70% of the data analytics market share in India.
Name some of the key concepts of SAS.
Ans: Some of the key concepts of SAS are:
- SORT procedure
- DATA step logic
- Log
- Data types
- Missing values
- DROP=, KEEP=, and IN= data set options
- Reset to missing or the RETAIN statement
- FORMAT procedure for creating value formats
List some common programming errors committed in SAS.
Ans: Some commonly committed programming errors are:
- A missing semicolon
- Not using debugging techniques
- Not checking the log after submitting the program
- Not making enough use of FSVIEW to inspect the data
Can you explain the PUT function and INPUT function in SAS?
Ans:
- INPUT function: converts character values to numeric values using an informat.
INPUT(source, informat)
- PUT function: converts numeric values to character values using a format.
PUT(source, format)
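A short sketch of both conversions follows. The variable names and sample values are invented for illustration; COMMA10. and Z5. are standard SAS informat and format names.

```sas
data _null_;
    char_amount = '1,234.50';
    num_amount  = input(char_amount, comma10.);  /* character -> numeric 1234.5 */

    num_id  = 42;
    char_id = put(num_id, z5.);                  /* numeric -> character '00042' */

    put num_amount= char_id=;
run;
```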
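What are the special input delimiters used in SAS?
Ans: The special delimiter options used in SAS are DLM= and DSD, both specified on the INFILE statement. DLM= names the delimiter character. DSD sets the default delimiter to a comma, treats two consecutive delimiters as a missing value, and removes quotation marks from quoted values.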
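The effect of DSD and DLM= on the INFILE statement can be sketched with inline data. The data set `people` and its variables are hypothetical. The first record has a quoted value containing a comma, and the second has two consecutive commas.

```sas
data people;
    infile datalines dsd dlm=',';
    input name :$20. city :$20. age;
    datalines;
"Smith, Ann",Boston,34
Raj,,41
;
run;

proc print data=people;
run;
```

With DSD in effect, the quoted comma stays inside `name`, and the empty field in the second record is read as a missing `city` rather than being skipped.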
What is a SAS informat?
Ans: An informat is an instruction that tells SAS how to read data values into a variable. Informats are used when reading data from external files such as flat, ASCII, and text files.
What is PDV?
Ans: PDV means Program Data Vector. It is the area of memory where SAS builds a data set, one observation at a time. When a program that reads raw data is executed, an input buffer is also created. Data values are read from the input buffer into the PDV, where they are assigned to their respective variables.
What is PROC in SAS?
Ans: A PROC (procedure) step analyzes and processes data in the form of a SAS data set. It calls a library of prewritten routines that perform operations on SAS data sets such as listing, sorting, and summarizing.
What is the difference between INPUT and INFILE?
Ans: INPUT - It is used to describe your variables.
INFILE - It is used to identify an external file.
How to limit decimal places for the variables using PROC MEANS?
Ans: With the help of the MAXDEC= option, we can limit the decimal places shown for the variables.
What is a SAS data set?
Ans: A SAS data set is a file consisting of two parts: a descriptor portion and a data portion.
Mention the categories in which SAS informats are placed.
Ans: SAS informats fall into three categories:
- Date/Time informats: INFORMATw.
- Numeric informats: INFORMATw.d
- Character informats: $INFORMATw.
How is data accessed from an external data file in SAS?
Ans: To read data from external files (flat files, text files, ASCII files, or sequential files), SAS informats are used. They tell SAS how to read the data into SAS variables.
How is data accessed from an existing data set in SAS?
Ans: The SET statement is used to read data from an existing data set. Variables read with SET are automatically retained from one observation to the next.
Difference between FORMAT and INFORMAT.
Ans: A FORMAT tells SAS how to display or print the value of a variable.
An INFORMAT tells SAS how to read a value, for example that a number arrives in a particular layout.
What is the function of the OUTPUT statement in a SAS program?
Ans: The OUTPUT statement writes the current observation to the SAS data set named in the DATA statement. If the DATA statement names more than one data set, the OUTPUT statement can tell SAS which data set to write the observation to. If a DATA step contains no OUTPUT statement, SAS writes the observation automatically at the end of each iteration of the step.
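A minimal sketch of directing observations to different data sets with OUTPUT, using the SASHELP.CLASS sample data. The output names `boys` and `girls` are illustrative.

```sas
/* One DATA step, two output data sets. */
data boys girls;
    set sashelp.class;
    if sex = 'M' then output boys;   /* write this observation to BOYS only  */
    else output girls;               /* otherwise write it to GIRLS only     */
run;
```

Once any OUTPUT statement appears in the step, the automatic write at the end of each iteration no longer happens, so every observation to be kept must be written explicitly.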
What is the difference between SAS functions and procedures?
Ans: Functions expect their argument values to be supplied across an observation, that is, from several variables in the same row. Procedures expect one value of a variable for every observation, so they work down the rows of a data set.
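The row-wise versus column-wise distinction can be sketched with the MEAN function and PROC MEANS. The data set `quiz` and its variables are made up for this example.

```sas
data quiz;
    input student $ q1 q2 q3;
    avg_score = mean(q1, q2, q3);   /* function: across variables within one observation */
    datalines;
Ann 8 9 7
Raj 6 10 8
;
run;

proc means data=quiz mean;          /* procedure: across observations for each variable */
    var q1 q2 q3;
run;
```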
What are the differences between PROC MEANS and PROC SUMMARY?
Ans:
PROC MEANS | PROC SUMMARY |
PROC MEANS prints its statistics in the output by default. | PROC SUMMARY does not print anything by default. Use the PRINT option, or use the OUTPUT statement to create a new data set and PROC PRINT to see the computed statistics. |
When there is no VAR statement, PROC MEANS analyzes all the numeric variables by default. | PROC SUMMARY analyzes the variables listed in the VAR statement. Without a VAR statement it produces only a count of observations. |
Both procedures produce subgroup statistics with a CLASS statement, or with a BY statement when the input data has already been sorted by the BY variables.
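A side-by-side sketch on the SASHELP.CLASS sample data may help. The output data set name `class_stats` and the statistic names `avg_height` and `avg_weight` are arbitrary.

```sas
/* PROC MEANS prints its results directly. */
proc means data=sashelp.class mean maxdec=1;
    class sex;
    var height weight;
run;

/* PROC SUMMARY writes results to a data set, which is then printed. */
proc summary data=sashelp.class;
    class sex;
    var height weight;
    output out=class_stats mean=avg_height avg_weight;
run;

proc print data=class_stats;
run;
```

The first step also shows MAXDEC=, which limits the printed statistics to one decimal place.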
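How to remove duplicates using PROC SQL?
Ans:
In PROC SQL, duplicates are removed with a SELECT DISTINCT query. More generally, there are three ways to delete duplicates in SAS:
- By using the NODUPKEY or NODUP option in PROC SORT
- By cleaning the data in a DATA step
- By using an SQL query with the DISTINCT keyword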
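The SQL approach can be sketched with SELECT DISTINCT. The data set `visits` and its contents are invented for the example.

```sas
data visits;
    input id visit_date :date9.;
    format visit_date date9.;
    datalines;
1 01JAN2024
1 01JAN2024
2 15FEB2024
;
run;

proc sql;
    create table visits_unique as
    select distinct *          /* keep one copy of each fully identical row */
    from visits;
quit;
```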
What is the DATA step?
Ans: In SAS, the DATA step is the part of a program that builds SAS data sets by reading and manipulating data. It also creates two automatic variables: _N_ and _ERROR_.
What is SLIBREF?
Ans: SLIBREF means server-libref. It specifies the libref that the server uses to identify the SAS library when no physical name is given.
Why is the double trailing @@ used in INPUT statements?
Ans: A double trailing @@ in an INPUT statement tells SAS to hold the current record across iterations of the DATA step, so that the next INPUT statement continues reading from the same record rather than moving on to a new one.
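A small sketch of the double trailing @@, where one input line holds several city/temperature pairs. The data set and variable names are illustrative.

```sas
data temps;
    input city $ temp @@;      /* hold the line and keep reading pairs from it */
    datalines;
Pune 31 Delhi 38 Kochi 29
;
run;

proc print data=temps;         /* three observations from one input line */
run;
```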
SAS Interview Questions for experienced
What are the parameters of the SCAN function?
Ans: The SCAN function is written in the following way:
scan(argument, n, delimiters)
Here, argument is the character variable or expression to scan, n indicates which word to return, and delimiters lists the characters that separate words, enclosed in quotes.
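For instance, a minimal sketch with a comma-delimited string (the variable names are made up). A negative n counts words from the right.

```sas
data _null_;
    fruit_list = 'apple,banana,cherry';
    second = scan(fruit_list, 2, ',');     /* banana */
    final  = scan(fruit_list, -1, ',');    /* cherry */
    put second= final=;
run;
```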
Differences between NODUP and NODUPKEY.
Ans: NODUPKEY compares only the BY variables. It checks for observations with duplicate BY values and eliminates them, keeping the first observation for each BY value.
NODUP compares all the variables present in the data set. When NODUP is specified, the SORT procedure compares the current observation with the previous observation. If the observations match on all the variables, the current observation is left out of the output data set.
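In PROC SORT, NODUPKEY removes observations that repeat the BY values, while NODUPRECS (alias NODUP) removes an observation only when it matches the previous one on every variable. A hedged sketch on a tiny made-up data set `orders` shows the difference:

```sas
data orders;
    input cust_id item $;
    datalines;
1 pen
1 pen
1 ink
2 pad
;
run;

/* One observation per cust_id: expect 2 observations. */
proc sort data=orders out=by_key nodupkey;
    by cust_id;
run;

/* Drops only the repeated "1 pen" record: expect 3 observations. */
proc sort data=orders out=by_record noduprecs;
    by cust_id;
run;
```

Because NODUPRECS compares only adjacent observations after sorting, identical records that do not end up next to each other can survive. Sorting by all variables avoids that.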
Difference between PROC CONTENTS and PROC PRINT?
Ans: PROC PRINT lists the values of the variables in a SAS data set, which helps confirm that the data was read correctly into SAS.
PROC CONTENTS displays descriptive information about the SAS data set, such as its variable names, types, and lengths.
Differentiate CEIL and FLOOR.
Ans: CEIL returns the smallest integer greater than or equal to the argument, whereas FLOOR returns the greatest integer less than or equal to the argument.
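Negative arguments are where the two functions are most often confused, so this short sketch includes both signs:

```sas
data _null_;
    a = ceil(4.2);      /*  5 */
    b = floor(4.2);     /*  4 */
    c = ceil(-4.2);     /* -4 */
    d = floor(-4.2);    /* -5 */
    put a= b= c= d=;
run;
```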
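What is the length assigned to the target variable by the SCAN function?
Ans: Using the default or specified delimiters, the SCAN function returns a given word from a character string. If the target variable has not already been given a length, the length assigned by the SCAN function is 200 in older SAS releases; more recent releases document that the variable takes the length of the first argument instead.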
Advanced SAS Interview Questions
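What is the command used to find the missing values?
Ans: The MISSING function checks a single value and returns 1 if it is missing and 0 otherwise. To count the missing values among several fields, use NMISS for numeric arguments or CMISS for numeric and character arguments:
missing_values=CMISS(field1,field2,field3);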
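Note that MISSING accepts a single argument. For several fields at once, NMISS (numeric arguments) and CMISS (numeric or character arguments) return a count of missing values. A small sketch with made-up variables:

```sas
data _null_;
    x = 1;
    y = .;
    name = ' ';
    is_y_missing  = missing(y);          /* 1: y is missing            */
    n_num_missing = nmiss(x, y);         /* 1: one numeric is missing  */
    n_any_missing = cmiss(x, y, name);   /* 2: y and name are missing  */
    put is_y_missing= n_num_missing= n_any_missing=;
run;
```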
How to create multiple observations from a single observation?
Ans: Using the double trailing @@ in an INPUT statement, we can create multiple observations from a single input record.
What is the use of the %INCLUDE statement?
Ans: The %INCLUDE statement brings an entire external file into the SAS program that is currently running and submits its contents to SAS immediately.
What are the scrubbing procedures in SAS?
Ans: Data scrubbing in SAS is done with PROC SORT and its NODUPKEY option, which eliminates observations with duplicate BY values.
What is the difference between DO WHILE and DO UNTIL?
Ans: The DO WHILE expression is evaluated at the top of the DO loop, so if the expression is false the first time it is checked, the loop never executes. The DO UNTIL expression is evaluated at the bottom of the loop, so a DO UNTIL loop always executes at least once.
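The difference shows up when the condition is already decided before the loop starts. In this sketch (counter names are arbitrary), only the DO UNTIL body writes to the log:

```sas
data _null_;
    i = 10;
    do while (i < 5);        /* tested at the top: body never runs     */
        put 'WHILE body ran: ' i=;
        i = i + 1;
    end;

    j = 10;
    do until (j >= 5);       /* tested at the bottom: body runs once   */
        put 'UNTIL body ran: ' j=;
        j = j + 1;
    end;
run;
```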
What is the use of PROC GPLOT?
Ans: PROC GPLOT produces scatter and line plots; its PROC statement identifies the data set that contains the plot variables. PROC GPLOT has more options than PROC PLOT, so it can create fancier and more colourful graphics.
What is the significance of the TRANWRD function in SAS?
Ans: In SAS, the TRANWRD function replaces every occurrence of a pattern of characters within a character string with another string. Replacing the pattern with a blank is a common way to remove it.
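TRANWRD takes the source string, the target substring, and the replacement, and changes every occurrence of the target. A minimal sketch with an invented string:

```sas
data _null_;
    text = 'Mrs. Smith and Mrs. Jones';
    replaced = tranwrd(text, 'Mrs.', 'Ms.');   /* Ms. Smith and Ms. Jones */
    put replaced=;
run;
```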
How can you merge or combine datasets in SAS?
Ans:
Using the following syntax, we can merge or combine data sets in SAS. Adding a BY statement after MERGE turns this into a match-merge on the BY variables.
Data combined;
Merge data1 data2;
run;
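A self-contained sketch of a match-merge follows, which adds a BY statement to the MERGE syntax above. The data sets `customers` and `payments` and the key `cust_id` are hypothetical, and both inputs are already in `cust_id` order, as a BY statement requires.

```sas
data customers;
    input cust_id name $;
    datalines;
1 Ann
2 Raj
3 Lee
;
run;

data payments;
    input cust_id amount;
    datalines;
1 250
3 90
;
run;

data combined;
    merge customers payments;
    by cust_id;                /* pair observations on the common key */
run;
```

Customer 2 has no payment record, so `amount` is missing for that observation in `combined`.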
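How to create a permanent data set?
Ans: To create a permanent data set, two steps are involved.
- First, assign a libref to a storage location (and, if needed, an engine) with the LIBNAME statement.
- Then create the data using a two-level name, libref.dataset, so that the data set is stored permanently in that library.
Which is more suitable to combine datasets: a one-to-one merge or a match-merge?
Ans: It depends on the data. A one-to-one merge pairs observations purely by position, so it is suitable only when the data sets are in the same order and correspond observation by observation. A match-merge uses a BY statement to pair observations on a common key and is the safer choice whenever such a key exists.
What is DATA _NULL_?
Ans:
DATA _NULL_ is a DATA step that does not create any data set. It is used for creating macro variables and for writing output without producing a data set.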
What is the purpose of the RETAIN statement?
Ans: The purpose of the RETAIN statement is to keep a value once it has been assigned. When the DATA step moves from the current iteration to the next, the RETAIN statement tells SAS to carry the values of the named variables forward rather than resetting them to missing.
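A running total is a common use of RETAIN. The sketch below accumulates the `weight` variable of the SASHELP.CLASS sample data. The names `running` and `total_weight` are illustrative.

```sas
data running;
    set sashelp.class;
    retain total_weight 0;                  /* keep the value across iterations, start at 0 */
    total_weight = total_weight + weight;   /* add this observation's weight                */
run;

proc print data=running;
    var name weight total_weight;
run;
```

Without the RETAIN statement, `total_weight` would be reset to missing at the start of every iteration and the sum would never accumulate.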
Difference between SYMPUT and SYMGET?
Ans: CALL SYMPUT stores a value from the DATA step in a macro variable, while SYMGET retrieves the value of a macro variable into the DATA step.
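A round trip through a macro variable can be sketched as follows. The macro variable name `n_students` is made up, and the first step counts the observations in the SASHELP.CLASS sample data.

```sas
/* Store a DATA step value in a macro variable. */
data _null_;
    set sashelp.class end=last_obs;
    if last_obs then
        call symput('n_students', strip(put(_n_, best12.)));
run;

/* Read the macro variable back into a DATA step variable. */
data _null_;
    n = input(symget('n_students'), best12.);
    put n=;
run;
```

CALL SYMPUT expects a character value, which is why the number is converted with PUT first. SYMGET likewise returns character text, converted back here with INPUT.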
What is the command used to save logs in an external file?
Ans:
To save the log to an external file, the PROC PRINTTO procedure is used.
PROC PRINTTO log="C:\Users\MyFolder\Downloads\LOG.txt" new;
run;
Which function is used to count the number of intervals between two SAS dates?
Ans:
To count the number of intervals between two SAS dates, the interval function INTCK is used.
INTCK(interval,start-of-period,end-of-period)
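For example, a small sketch counting month and day intervals between two date literals:

```sas
data _null_;
    months = intck('month', '01JAN2024'd, '01MAR2024'd);   /* 2  */
    days   = intck('day',   '01JAN2024'd, '01MAR2024'd);   /* 60: 31 days in January plus 29 in February 2024 */
    put months= days=;
run;
```

By default INTCK counts interval boundaries crossed, so the month count depends on how many first-of-month dates fall between the two dates rather than on elapsed days.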
What is the function of the CATX function?
Ans: The CATX function in SAS removes leading and trailing blanks from its arguments, inserts a delimiter between them, and returns the concatenated character string.
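A minimal sketch, with deliberately padded sample values so the blank trimming is visible (the variable names are made up):

```sas
data _null_;
    first_name = '  Ann ';
    last_name  = ' Lee   ';
    full_name  = catx(' ', first_name, last_name);   /* Ann Lee */
    put full_name=;
run;
```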
Define the uses of the SYSRC function.
Ans: The SYSRC function returns a system error number.
What is the process for copying the whole library?
Ans: Use the COPY procedure, or the COPY statement in PROC DATASETS, naming the input library with IN= and the output library with OUT=, to copy the whole library.
Define _N_ and _ERROR_ variables in SAS.
Ans: The SAS DATA step includes two automatic variables: _N_ and _ERROR_. The _N_ variable counts the number of times the DATA step has iterated. It starts at 1 and increases by 1 with each iteration. The _ERROR_ variable signals errors such as mathematical errors and conversion errors. It has a default value of 0 and is set to 1 whenever an error occurs.
List out the various SAS procedures and functions.
Ans: The following are some of the SAS functions:
Scan, Substr, Trim, Compress, CATX, Index, Count, etc.
Below are some of the SAS procedures:
- Proc Means
- Proc Sort
- Proc SQL
- Proc Freq
- Proc Report
- Proc Rank
- Proc Copy
- Proc Append
Define the various ways to build macro variables.
Ans: The following are the different ways to build macro variables in SAS.
- %LET
- CALL SYMPUT
- Macro parameters
- PROC SQL INTO clause
- Iterative %DO loop
- %GLOBAL
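Three of these approaches can be sketched together. The macro variable names `cutoff`, `run_date`, and `n_older` are invented for the example, which uses the SASHELP.CLASS sample data.

```sas
%let cutoff = 13;                                     /* %LET */

data _null_;
    call symput('run_date', put(today(), date9.));   /* CALL SYMPUT */
run;

proc sql noprint;
    select count(*) into :n_older trimmed            /* PROC SQL INTO clause */
    from sashelp.class
    where age > &cutoff;
quit;

%put cutoff=&cutoff run_date=&run_date n_older=&n_older;
```

The %PUT statement writes the resolved values to the log, which is a quick way to confirm each macro variable was created.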
Define the various SAS system options useful to debug SAS macro programs.
Ans: The following SAS system options are available to debug or troubleshoot SAS macro programs:
- MLOGIC - This option traces the execution of macro logic and writes it to the log.
- MPRINT - This option displays the SAS statements generated by macro execution.
- MEMRPT - This option reports memory usage statistics.
- SYMBOLGEN - This option displays the results of resolving macro variable references.
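The macro debugging options MLOGIC, MPRINT, and SYMBOLGEN are usually switched on around the macro call being investigated. A hedged sketch with a hypothetical macro `list_older`:

```sas
options mlogic mprint symbolgen;         /* turn macro tracing on */

%macro list_older(min_age);
    proc print data=sashelp.class;
        where age > &min_age;
    run;
%mend list_older;

%list_older(14)

options nomlogic nomprint nosymbolgen;   /* turn tracing off again */
```

With these options on, the log shows the macro's execution flow, the generated PROC PRINT code, and the resolved value of `&min_age`.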
Can a SAS variable that contains characters or letters be a numeric data type?
Ans: No. A variable that contains letters must be a character data type, not a numeric one.
Explain interleaving.
Ans: Interleaving combines several sorted SAS data sets into a single sorted data set. SET and BY statements are used in a DATA step to interleave data sets. The number of observations in the new data set is the sum of the numbers of observations in the original data sets.
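A short sketch of interleaving two data sets that are each already sorted by `id`. The data sets `east` and `west` and their contents are made up.

```sas
data east;
    input id region $;
    datalines;
1 East
4 East
;
run;

data west;
    input id region $;
    datalines;
2 West
3 West
5 West
;
run;

/* SET with BY interleaves: 2 + 3 = 5 observations, in id order. */
data all_regions;
    set east west;
    by id;
run;
```

Without the BY statement, the same SET statement would simply concatenate the data sets, placing all `east` observations before all `west` observations.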
What is meant by the BOR function in SAS?
Ans: BOR is a SAS function that computes the bitwise logical OR of two integer arguments and returns the result.
How to create a compressed data set in SAS, and what are its benefits?
Ans: To build a compressed data set in SAS, specify the COMPRESS= option either as an output data set option or in an OPTIONS statement. Compressing a data set reduces its size by shrinking runs of repeated characters, which lowers the storage required for the data set.
Conclusion:
The questions above are frequently asked SAS interview questions provided by industry experts. They will help you crack your SAS interview. If you find that a question is not covered, you are welcome to drop a message in the comments, and we will get back to you with the answer.