R has gained a lot of momentum in data science over the last few years. For a SQL professional it may look a bit daunting at first; however, there are many similarities between RDBMS concepts and R concepts that make the learning curve a tad easier. One of these similarities is the data frame.
The data frame is one of the most important structures in R for holding data from external sources (for example, importing from a CSV file or loading from an RDBMS).
It is conceptually the same as a table in an RDBMS.
In the following, I have created a data frame with 3 columns and 6 rows.
In an RDBMS, this is the same as creating a table 'emp' and inserting 6 records.
emp <- data.frame(
  name=c("Zahir","Farook","Hameed","Basheer","Aslam","Suhaib"),
  deptno=c(10,20,30,30,20,20),
  city=c("Monroe","Trichy","Kilakarai","Kilakarai","Chennai","Chennai"))
When the data frame is referenced at the prompt, R prints the entire data set. This is similar to "SELECT * FROM EMP".
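As a small illustration of the point above, the sketch below rebuilds the same six-row data frame so that it runs on its own, then displays it both ways. It assumes nothing beyond base R.

```r
# Rebuild the example data frame so this snippet is self-contained.
emp <- data.frame(
  name   = c("Zahir", "Farook", "Hameed", "Basheer", "Aslam", "Suhaib"),
  deptno = c(10, 20, 30, 30, 20, 20),
  city   = c("Monroe", "Trichy", "Kilakarai", "Kilakarai", "Chennai", "Chennai")
)

# Typing the name at the prompt prints every row and column,
# much like SELECT * FROM EMP.
emp

# Auto-printing does not happen inside functions, so print() is the explicit form.
print(emp)
```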
The function "rbind" is used to append a record to the existing data set.
This is similar to "INSERT INTO EMP VALUES ('Karady', 100, 'Colombo')".
emp <- rbind(emp, data.frame(name=c("Karady"), deptno=c(100), city=c("Colombo")))
The function "nrow" is used to get the record count of the data set.
This is similar to "SELECT count(*) FROM EMP".
The function "ncol" is used to get the number of columns in the data set.
This is similar to "SELECT count(*) FROM information_schema.columns WHERE table_name = 'EMP'".
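The two counts can be sketched as follows. The snippet rebuilds the data frame and appends the 'Karady' row so that it runs on its own; the variable names row_count and col_count are only illustrative.

```r
# Rebuild the example data frame, including the appended row.
emp <- data.frame(
  name   = c("Zahir", "Farook", "Hameed", "Basheer", "Aslam", "Suhaib"),
  deptno = c(10, 20, 30, 30, 20, 20),
  city   = c("Monroe", "Trichy", "Kilakarai", "Kilakarai", "Chennai", "Chennai")
)
emp <- rbind(emp, data.frame(name = "Karady", deptno = 100, city = "Colombo"))

row_count <- nrow(emp)  # number of records: 6 original rows plus 1 appended = 7
col_count <- ncol(emp)  # number of columns: name, deptno, city = 3

# dim() returns both counts at once, rows first and columns second.
dim(emp)
```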
With the following example, we are filtering the records that have deptno = 30. This is similar to
"SELECT * FROM EMP WHERE DEPTNO = 30".
We can add an additional filter with the pipe character (|), which R uses as the 'OR' operator.
This is similar to "SELECT * FROM EMP WHERE DEPTNO = 30 OR CITY = 'Chennai'".
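A minimal sketch of the OR filter, again on a rebuilt copy of the six-row data frame and assuming the same logical-indexing style; the name dept30_or_chennai is illustrative.

```r
# Rebuild the example data frame so this snippet is self-contained.
emp <- data.frame(
  name   = c("Zahir", "Farook", "Hameed", "Basheer", "Aslam", "Suhaib"),
  deptno = c(10, 20, 30, 30, 20, 20),
  city   = c("Monroe", "Trichy", "Kilakarai", "Kilakarai", "Chennai", "Chennai")
)

# | combines the two conditions element by element, so a row is kept
# when either condition is TRUE for it.
dept30_or_chennai <- emp[emp$deptno == 30 | emp$city == "Chennai", ]
dept30_or_chennai
```

Replacing | with & turns the condition into the SQL 'AND' equivalent.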
As we can see, there are a lot of similarities between the concept of a table (a set of tuples) and the data frame.
This could be a starting point for a SQL professional to get familiar with R.
I understand that I have just scratched the surface of the data frame and its functions.
As of now, Oracle and MS SQL Server have incorporated R into their offerings.
Comments Welcome.