Last updated on Nov 07, 2023
If you want to build a career in SAS, you are on the right platform, whether you are a fresher or an experienced professional. HKR Trainings will help you prepare to crack SAS interviews. We provide the top 50 frequently asked SAS interview questions, divided into three levels: Beginner, Intermediate and Advanced. Before we move to the questions, let us understand what SAS is. SAS is a statistical software suite used for data management, advanced analytics, business intelligence, multivariate analysis, predictive analytics and criminal investigation. It is easy to learn, and it is a good option if you already know SQL. Now let's get started with the frequently asked SAS interview questions prepared by our HKR Trainings team.
Now let's have a look into the SAS interview questions based on the levels.
Ans: SAS means Statistical Analysis System. SAS is an integrated set of software products mainly used for Statistical data analysis and visualization.
Some of its functions are:
Ans: Although time-series analysis is closely associated with business, economics and finance, time-series data also arise in many other fields. The main uses of SAS procedures are economic and financial modeling, economic analysis, forecasting, financial reporting, time-series analysis and manipulation of time-series data. SAS is useful whenever time dependencies, simultaneous relationships or dynamic processes complicate data analysis.
Ans: To run the program successfully, you need to follow these basic elements:
Ans: The SAS Program includes the following steps:
Ans: SAS has two data types: numeric and character. Dates are stored as numeric values (the number of days since January 1, 1960) and are displayed with date formats; SAS also provides functions for working with them.
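A minimal sketch of how a SAS date behaves as a number. The variable names are illustrative only.

```sas
data _null_;
   /* A date literal is stored as the number of days since 01JAN1960 */
   d0 = '01JAN1960'd;
   d  = '07NOV2023'd;
   put d0=;                      /* writes d0=0 to the log */
   put 'Raw value:       ' d;    /* unformatted day count */
   put 'Formatted value: ' d date9.;
run;
```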
Ans: The main components used in SAS programming are datasets, statements and variables.
Ans: The PROC SORT procedure is used to sort the observations of a SAS data set by one or more variables. With several BY variables, it also sorts observations within groups.
Ans: SAS Framework has four Capabilities. They are: Access, Manage, Analyze and Present.
Ans: The following are the reasons why SAS is used over other data analytics tools:
Ans: Some of the key concepts of SAS are:
Ans: Some of the commonly committed programming errors are:
Ans:
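Ans: The special delimiter options used in SAS are DSD and DLM=, both specified on the INFILE statement. DLM= names the delimiter character, while DSD treats consecutive delimiters as a missing value and removes quotation marks from quoted values.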
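As a rough illustration, the sketch below reads comma-delimited inline data with both options. The data set name, variable names and values are made up for the example.

```sas
data employees;
   /* DLM= sets the delimiter; DSD treats two consecutive delimiters
      as a missing value and strips quotes from quoted values */
   infile datalines dsd dlm=',';
   input name :$20. dept :$10. salary;
   datalines;
"Smith, John",Sales,52000
Lee,,48000
;
run;

proc print data=employees;
run;
```

In the second record, the empty field between the two commas is read as a missing value for dept.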
Ans: To read various data values, SAS uses informats, which tell it how to read data from external files such as flat, ASCII and text files.
Ans: PDV means Program Data Vector. It is the memory area where SAS builds a data set, one observation at a time. When a program reads raw data, an input buffer is also created. The input buffer holds each record as it is read, and the values are then assigned to their respective variables in the PDV.
Ans: A PROC step in SAS analyzes and processes data in the form of a SAS data set. It calls a library of routines that perform operations on SAS data sets, such as listing, sorting and summarizing.
INPUT - It is used to describe your variables.
INFILE - It is used to identify an external file.
Ans: With the help of MAXDEC= option we can limit decimal places for the variable.
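For instance, a sketch using PROC MEANS, assuming the sample data set sashelp.class that ships with SAS is available:

```sas
/* MAXDEC=2 limits the printed statistics to two decimal places */
proc means data=sashelp.class mean std maxdec=2;
   var height weight;
run;
```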
Ans: A SAS data set is a file consisting of two parts: A descriptor portion and a data portion.
Ans: There are three categories in which SAS informats are placed. They are:
Ans: To input or read the data from external files known as Flat files, text files, ASCII files or sequential files, SAS INFORMATS are used. They tell SAS on how to read data into SAS variables.
Ans: The SET statement reads observations from an existing data set. The values of variables read with SET are retained from one iteration to the next rather than being reset to missing.
Ans: To indicate how to print the variables, we use FORMAT.
To indicate that a number should be read in a particular format, we use INFORMAT.
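The sketch below shows the two side by side: the informats control how the raw text is read, and the formats control how the stored values are displayed. The data set, variables and values are hypothetical.

```sas
data sales;
   /* INFORMAT: how to read the raw values */
   informat sale_date mmddyy10. amount comma10.;
   /* FORMAT: how to display the stored values */
   format sale_date date9. amount dollar12.2;
   input sale_date amount;
   datalines;
11/07/2023 1,250
12/15/2023 980
;
run;

proc print data=sales;
run;
```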
Ans: The OUTPUT statement writes the current observation to a SAS data set named in the DATA statement. If more than one data set is named in the DATA statement, you can use the OUTPUT statement to tell SAS which data set each observation should be written to. If the DATA step contains no OUTPUT statement, SAS writes the observation automatically at the end of each iteration.
Ans: The difference between SAS functions and procedures is that functions operate on values supplied across the variables of a single observation, while procedures operate on the values of a variable across all observations.
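A short sketch of routing observations to two data sets with OUTPUT, assuming the sample data set sashelp.class is available. The output data set names are illustrative.

```sas
data teens preteens;
   set sashelp.class;
   /* Each observation is written only to the data set named on OUTPUT */
   if age >= 13 then output teens;
   else output preteens;
run;
```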
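To make the contrast concrete, the sketch below computes a mean both ways on a small made-up data set: the MEAN function works across variables within each observation, while PROC MEANS works down the observations for one variable.

```sas
data scores;
   input q1 q2 q3;
   /* Function: mean across q1-q3 within the current observation */
   row_mean = mean(q1, q2, q3);
   datalines;
80 90 70
60 . 75
;
run;

/* Procedure: mean of q1 across all observations */
proc means data=scores mean;
   var q1;
run;
```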
Ans:
PROC MEANS | PROC SUMMARY |
PROC MEANS produces subgroup statistics only when a BY statement is used and the input data has been sorted previously by the BY variables. | PROC SUMMARY produces statistics for all subgroups automatically, giving all the information that would otherwise require repeatedly sorting a data set by the variables that define each subgroup and running PROC MEANS. PROC SUMMARY does not produce any printed output by default, so we need to use the OUTPUT statement to create a new data set and PROC PRINT to see the computed statistics. |
PROC MEANS takes all the numeric variables into the analysis by default. | PROC SUMMARY takes the variables that are listed in the VAR statement into the statistical analysis. |
Ans:
There are three ways to delete the duplicates using PROC SQL.
Ans: In SAS, the DATA step is the group of statements that builds SAS data sets by reading and manipulating data. It also creates the following automatic variables: _N_ and _ERROR_.
Ans: SLIBREF means server-libref. It specifies the libref which is used by the server to identify the SAS library data when no physical name is determined.
Ans: Including a double trailing @@ in an INPUT statement tells SAS to hold the current record across DATA step iterations, so that the next INPUT statement continues reading from the same record rather than moving to a new one.
Ans: Scan function is written in the following way:
scan(argument, n, delimiters)
Here argument is the character variable or expression, n indicates which word to read, and delimiters is a quoted string of the special characters that separate the words.
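A small sketch of SCAN with explicit delimiters. The string and variable names are made up; a negative n counts words from the end of the string.

```sas
data _null_;
   text   = 'alpha,beta;gamma';
   second = scan(text, 2, ',;');   /* second word */
   final  = scan(text, -1, ',;');  /* last word */
   put second= final=;             /* writes second=beta final=gamma */
run;
```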
Ans: NODUPKEY compares only the BY variables in the data set. It checks for observations with duplicate BY values and eliminates them.
NODUP (NODUPRECS) compares all the variables in the data set. When NODUP is specified, the SORT procedure compares the current observation with the previous observation. If the observations match on all the variables, the current observation is left out of the output data set.
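The two options can be compared with a sketch like the following, assuming the sample data set sashelp.class is available. NODUPKEY keeps one observation per BY value, while NODUPRECS (alias NODUP) drops an observation only when it matches the previous one on every variable. The output data set names are illustrative.

```sas
/* Keep one observation per distinct value of age */
proc sort data=sashelp.class out=one_per_age nodupkey;
   by age;
run;

/* Drop only observations identical to the previous one on all variables */
proc sort data=sashelp.class out=no_dup_rows noduprecs;
   by age;
run;
```

Because NODUPRECS compares adjacent observations only, identical records that do not end up next to each other after sorting may both be kept.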
Ans: PROC PRINT is used to list the values of the variables in a SAS data set and ensure that data was read correctly into SAS.
PROC CONTENTS displays information about the SAS dataset.
Ans: CEIL returns the smallest integer greater than/equal to the argument whereas FLOOR returns the greatest integer less than/equal to the argument.
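A quick sketch of both functions, including negative arguments, where the direction of rounding is easy to get wrong:

```sas
data _null_;
   up       = ceil(2.1);
   down     = floor(2.9);
   neg_up   = ceil(-2.1);
   neg_down = floor(-2.1);
   put up= down= neg_up= neg_down=;  /* up=3 down=2 neg_up=-2 neg_down=-3 */
run;
```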
Ans: Using the default or specified delimiters, the SCAN function returns a given word from a character string. In a DATA step, if the target variable has not already been given a length, SCAN assigns it a length of 200.
Ans: The MISSING function tests a single value. To count the missing values across several variables, use
missing_values=CMISS(field1,field2,field3);
Ans: Using the double trailing @@ we can create multiple observations from a single input record.
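As an illustration, the sketch below reads several id/value pairs from each input line. The data set name and values are hypothetical.

```sas
data readings;
   /* @@ holds the record so each iteration reads the next pair from it */
   input id value @@;
   datalines;
1 10.5 2 11.2 3 9.8
4 12.0 5 10.1
;
run;

proc print data=readings;
run;
```

The two input lines produce five observations, one per id/value pair.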
Ans: %INCLUDE statement reads the entire file into the current SAS program we are running and submits the file to the SAS system immediately.
Ans: The scrubbing procedure in SAS is PROC SORT with the NODUPKEY option. It eliminates the duplicate values.
Ans: The difference between the DO WHILE and DO UNTIL statements is that the DO WHILE expression is evaluated at the top of the DO loop. If the expression is false the first time it is evaluated, the loop never executes. The DO UNTIL expression is evaluated at the bottom of the loop, so a DO UNTIL loop always executes at least once.
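The difference shows up when the condition is already decided before the loop starts, as in this sketch with illustrative variable names:

```sas
data _null_;
   i = 10;
   /* Condition checked first: false at the start, so the body never runs */
   do while (i < 5);
      put 'DO WHILE body ran, i=' i;
      i + 1;
   end;

   j = 10;
   /* Condition checked after the body: the body runs once, then exits */
   do until (j >= 5);
      put 'DO UNTIL body ran, j=' j;
      j + 1;
   end;
run;
```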
Ans: It identifies the data set that contains the plot variables. PROC GPLOT has more options so it can create fancier and colourful graphics.
Ans: In SAS, the TRANWRD function is used to replace or remove all occurrences of a pattern of characters within a character string.
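A minimal sketch of TRANWRD replacing every occurrence of a word. The strings are made up for the example.

```sas
data _null_;
   text     = 'red car, red bike';
   /* tranwrd(source, target, replacement) */
   replaced = tranwrd(text, 'red', 'blue');
   put replaced=;   /* writes replaced=blue car, blue bike */
run;
```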
Ans:
Using the following Syntax we can merge or combine datasets in SAS.
Data combined;
Merge data1 data2;
run;
Ans: To create a permanent data set, two steps are involved.
Ans: one-to-one merge is more suitable to combine datasets than a match-merge option.
Ans:
DATA _NULL_ is a DATA step that does not create any data set. It is used for creating macro variables and for writing output without creating a data set.
Ans: The purpose of the RETAIN statement is to keep a value once it has been assigned. When the DATA step moves from the current iteration to the next, the RETAIN statement tells SAS to keep the values of the named variables rather than resetting them to missing.
Ans: SYMPUT is used to store data set values in a macro variable, while SYMGET is used to retrieve the value of a macro variable into the data set.
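For example, the following sketch writes a line to the log for each observation without creating an output data set, assuming the sample data set sashelp.class is available:

```sas
data _null_;
   set sashelp.class;
   /* PUT writes to the log; no data set is created by this step */
   put name= age=;
run;
```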
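A running total is a common use, sketched here with the sample data set sashelp.class (assumed to be available). The variable total_height is illustrative.

```sas
data running;
   set sashelp.class;
   /* Without RETAIN, total_height would be reset to missing
      at the start of every iteration */
   retain total_height 0;
   total_height = total_height + height;
run;
```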
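One way this might look in practice: the first step stores a value with CALL SYMPUT, and a later step reads it back with SYMGET. The macro variable name cutoff and the derived variables are hypothetical, and sashelp.class is assumed to be available.

```sas
/* Store a value in the macro variable CUTOFF */
data _null_;
   call symput('cutoff', '13');
run;

/* Retrieve it in a later DATA step; SYMGET returns a character value */
data flagged;
   set sashelp.class;
   cutoff  = input(symget('cutoff'), best12.);
   is_teen = (age >= cutoff);
run;
```

The macro variable is created when the first step executes, which is why it is read in a separate, later step.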
Ans:
To save logs in the external file, PROC PRINTTO command is used.
PROC PRINTTO log="C:\Users\MyFolder\Downloads\LOG.txt" new;
run;
Ans:
To count the number of intervals between two SAS dates, the interval function INTCK is used.
INTCK(interval,start-of-period,end-of-period)
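A short sketch with two date literals. By default INTCK counts interval boundaries crossed, so the month count below reflects the two month starts (February 1 and March 1) between the dates.

```sas
data _null_;
   months = intck('month', '15JAN2023'd, '01MAR2023'd);
   days   = intck('day',   '15JAN2023'd, '01MAR2023'd);
   put months= days=;   /* writes months=2 days=45 */
run;
```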
Ans: The CATX function in SAS removes leading and trailing blanks, inserts delimiters between the values, and returns a concatenated character string.
Ans: The SYSRC function returns a system error number.
Ans: To copy a whole library, use the COPY procedure (or the COPY statement in PROC DATASETS) and specify the input and output libraries.
Ans: The SAS DATA step includes two automatic variables, _N_ and _ERROR_. The _N_ variable counts the number of times the DATA step has iterated; it starts at 1 and increases by 1 with each iteration. The _ERROR_ variable flags errors such as mathematical errors and conversion errors. It has a default value of 0 and is set to 1 whenever an error occurs.
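A minimal sketch of CATX joining two padded values with a single space. The variable names and values are made up.

```sas
data _null_;
   first_name = '  Ada ';
   last_name  = 'Lovelace  ';
   /* The first argument is the delimiter placed between the values */
   full = catx(' ', first_name, last_name);
   put full=;   /* writes full=Ada Lovelace */
run;
```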
Ans: The following are the various SAS Functions:-
SCAN, SUBSTR, TRIM, COMPRESS, CATX, INDEX, COUNT, etc.
Below are the various SAS procedures:-
Ans: The following are the different ways to build Macro variables in SAS.
Ans: The following are the various SAS system options available to debug or troubleshoot SAS macro programs:
Ans. It can be a character data type instead of a numeric data type.
Ans: Interleaving is a SAS feature that combines several sorted SAS data sets into a single sorted data set. SET and BY statements are used in a DATA step to interleave data sets. The number of observations in the new data set equals the total number of observations in the original data sets.
Ans: BOR is a SAS function that computes the bitwise logical OR of two integer arguments and returns the result.
Ans: To build a compressed data set in SAS, we specify the COMPRESS= option either as a data set option on the output data set or in the OPTIONS statement. Compressing a data set reduces its size by reducing repeated characters, which in turn lowers the storage required for the data set.
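A sketch of interleaving two data sets sorted by the same variable, assuming the sample data set sashelp.class is available. The data set names boys, girls and all_by_age are illustrative.

```sas
/* Each input must already be sorted by the BY variable */
proc sort data=sashelp.class(where=(sex='M')) out=boys;
   by age;
run;

proc sort data=sashelp.class(where=(sex='F')) out=girls;
   by age;
run;

/* SET with BY interleaves the inputs, keeping the result sorted by age */
data all_by_age;
   set boys girls;
   by age;
run;
```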
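For example, 5 is 101 in binary and 3 is 011, so their bitwise OR is 111:

```sas
data _null_;
   x = bor(5, 3);
   put x=;   /* writes x=7 */
run;
```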
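Both forms might look like the sketch below, assuming the sample data set sashelp.class is available. The output data set name is illustrative.

```sas
/* As a data set option: compress only this output data set */
data class_small (compress=yes);
   set sashelp.class;
run;

/* As a system option: compress data sets created later in the session */
options compress=yes;
```

For very small data sets, compression overhead can outweigh the savings, so the log note about the size change is worth checking.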
Conclusion:
All the above mentioned questions are the frequently asked SAS interview Questions provided by the Industry Experts. They will help you to crack your SAS interview. However, if you find that any of the Questions is not covered, you are welcome to drop a message in the comments. We will revert to you with the answer.