It is often said that data manipulation alone takes 50-70% time of a data science project.
The duration of this never-ending activity can be attributed to our opaqueness with datasets provided.
Functions in this package enable familiarity with the data.frame to further reduce coding errors and re-work.
# The easiest way to get dplyr is to install from GitHub:
install.packages("dataframeexplorer", dependencies = T)
# Alternatively, you can install development version:
install.packages("devtools")
::install_github("ashrithssreddy/dataframeexplorer") devtools
Functions: [x] Percentiles [x] Level of dataset Univariate Analysis Bivariate Analysis Show progress bar for level_of_dataset Run the level_of_dataset code in parallel for performance
Changes: Return value for all functions to be included into
documentation Message not printed in all codes
Default filename not consistent Outputs not refined Pep 8
formatting examples not consistent Comments not consistent across all
codes sink() to be run in glimpse_to_file upon an error Arguement
format to be used: dataset = dataset, output_filename =
“dataset_glimpse.txt” Throw a warning when duplicate column names
are found Level: Unsink when interrupted Add
instructions to interpretation of output
1. glimpse_to_file
glimpse_to_file(mtcars, "mtcars_glimpse.txt") or glimpse_to_file(mtcars, "C://Users/Desktop/mtcars_glimpse.txt")
![Output](/man/figures/.png)
Mail ashrithssreddy@gmail.com for suggestions with “dataframeexplorer” in subject line.