Posts

LIS 4317 Final Project: Fuel Economy Data from the U.S. Dept. of Energy

  The problem statement for my final project is to investigate the association between vehicle features and CO2 emissions in a dataset of fuel efficiency data. The objective is to determine how much CO2 emissions differ among vehicle classes. This investigation matters for evaluating the environmental effects of automobiles and for pinpointing areas where fuel efficiency regulations could be strengthened. The hypothesis is that specific vehicle classes, such as larger cars or those with bigger engines, emit more CO2 than other classes. This issue sits within the larger framework of transportation research and environmental sustainability. The relationship between vehicle characteristics and emissions, particularly CO2 emissions, has been well studied in the past. Numerous techniques, such as statistical evaluations and graphics, have been used to in...

LIS 4317: Module #13 Assignment

  The animation, produced with the R programming language and the animation package, gives a clear visual representation of random sampling from a uniform distribution. Each frame plots ten randomly selected values from the uniform distribution. The y-axis limits are fixed at 0 and 1 so that frames can be compared. As the animation plays, viewers can see how the generated values vary from draw to draw; some frames show clustered dots, while others show a more scattered distribution. Saving the animation as a GIF makes it easy to share and adds a dynamic element to presentations, blog posts, or instructional materials, which makes it a useful tool for presenting ideas connected to random sampling. The plot() function creates the scatter plots, and the Sys.sleep() function adds a short delay between frames so that each one is visible. The ...
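The frame loop described in this excerpt can be sketched in base R as follows. This is a minimal sketch and not the post's actual code: the function name draw_uniform_frames, its arguments, the seed, and the frame count and delay are all assumptions made for illustration.

```r
# Hypothetical helper: draws n_frames scatter plots of n_points uniform draws.
draw_uniform_frames <- function(n_frames = 10, n_points = 10, delay = 0.2) {
  # One row per frame, so the sampled values can be inspected afterwards.
  samples <- matrix(NA_real_, nrow = n_frames, ncol = n_points)
  for (i in seq_len(n_frames)) {
    samples[i, ] <- runif(n_points)   # draws from Uniform(0, 1)
    plot(samples[i, ],
         ylim = c(0, 1),              # fixed y-axis so frames are comparable
         xlab = "Index", ylab = "Value",
         main = paste("Frame", i))
    Sys.sleep(delay)                  # brief pause so each frame is visible
  }
  invisible(samples)
}

set.seed(123)  # assumed seed, only for reproducibility
frames <- draw_uniform_frames(n_frames = 5, delay = 0.1)
```

To save the frames as a GIF, the same loop body can be wrapped in the animation package's saveGIF() function, which relies on an external image converter such as ImageMagick being installed.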

LIS 4317: Module # 12

  I was able to install and load the packages needed for network visualization, such as GGally, igraph, and ggplot2. The erdos.renyi.game function from the igraph package generated a random network, and the ggnet2 function from the GGally package visualized it, giving a rudimentary depiction of the network topology. There were a number of difficulties and mistakes along the way. I first ran into issues when trying to use functions such as rgraph and as.network (from the sna and network packages), because I was using the wrong functions and didn't have the necessary dependencies. Moreover, I made the mistake of applying functions from the network and sna packages when they were not required for the visualization, which caused confusion and needless difficulties. Despite several significant obstacles and setbac...
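The igraph-plus-ggnet2 route described in this excerpt might look roughly like the sketch below. The node count, edge probability, seed, and styling are assumptions, not values from the post. The sketch also assumes that the intergraph, network, and sna packages are installed, since ggnet2() relies on them to handle igraph objects.

```r
library(igraph)   # random graph generation
library(GGally)   # provides ggnet2()

set.seed(42)  # assumed seed, only for reproducibility

# Erdos-Renyi G(n, p) graph: 20 nodes, each possible edge present with
# probability 0.2 (illustrative values). Newer igraph releases also offer
# sample_gnp() for the same model.
g <- erdos.renyi.game(20, 0.2, type = "gnp")

# ggnet2() converts the igraph object internally and returns a ggplot object.
p <- ggnet2(g, node.size = 6, node.color = "steelblue", edge.color = "grey50")
print(p)
```

Going straight from igraph to ggnet2 avoids calling rgraph or as.network by hand, which is where the difficulties described above came from.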

LIS 4317: Module # 11 Assignment

  I created a scatter plot with marginal histograms for this assignment using R and the ggplot2 package. I first installed and loaded the packages, including ggplot2 and ggExtra. Next, I generated data representing annual budget expenditures per capita. I then built the scatter plot with ggplot2, keeping the style basic and adding a linear regression line to show the trend. ggExtra's ggMarginal function added the marginal histograms: I set the marginal plot type to histogram and enabled the margins on both axes. The result combines the scatter plot with summaries of each variable's marginal distribution, which gives a fuller picture of the data.
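The steps above can be sketched as follows. The data here are simulated purely for illustration: the column names revenue_per_capita and spending_per_capita, the sample size, and the seed are assumptions, not the values used in the post.

```r
library(ggplot2)
library(ggExtra)

set.seed(2024)  # assumed seed, only for reproducibility

# Simulated (made-up) per-capita budget figures for 30 hypothetical units.
budget <- data.frame(revenue_per_capita = runif(30, min = 2000, max = 6000))
budget$spending_per_capita <-
  500 + 0.8 * budget$revenue_per_capita + rnorm(30, sd = 300)

# Basic scatter plot with a linear regression line for the trend.
p <- ggplot(budget, aes(x = revenue_per_capita, y = spending_per_capita)) +
  geom_point() +
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE) +
  theme_minimal()

# Add histograms on both margins.
p_marginal <- ggMarginal(p, type = "histogram", margins = "both")
print(p_marginal)
```

ggMarginal() takes the finished scatter plot as input, so the regression line and theme are set on the base plot before the margins are added.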

LIS 4370: Module # 11 Debugging and defensive programming in R

  Bugged Code:

    tukey_multiple <- function(x) {
      outliers <- array(TRUE, dim = dim(x))
      for (j in 1:ncol(x)) {
        outliers[, j] <- outliers[, j] && tukey.outlier(x[, j])
      }
      outlier.vec <- vector(length = nrow(x))
      for (i in 1:nrow(x)) {
        outlier.vec[i] <- all(outliers[i, ])
      }
      return(outlier.vec)
    }

  Corrected Code:

    tukey_multiple <- function(x) {
      outliers <- array(FALSE, dim = dim(x))  # Corrected initialization
      for (j in 1:ncol(x)) {
        outliers[, j] <- tukey.outlier(x[, j])  # Corrected logic
      }
      outlier.vec <- vector(length = nrow(x))
      for (i in 1:nrow(x)) {
        outlier.vec[i] <- any(outliers[i, ])  # Corrected logic
      }
      return(outlier.vec)
    }

  Explanation: As soon as I looked over the code, I saw that the loop's upd...
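For comparison, here is a sketch of a narrower fix that changes only the operator. The `&&` operator expects length-one operands (R 4.3.0 and later raise an error when given longer vectors), while `&` works elementwise. Unlike the corrected version above, which uses any() and so flags a row with an outlier in at least one column, this variant keeps the original all() rule. The tukey.outlier helper is not shown in the excerpt, so the definition below is an assumption based on the usual 1.5 * IQR fences, and the name tukey_multiple_all is hypothetical.

```r
# Assumed helper (not shown in the excerpt): TRUE for values outside the
# Tukey fences, i.e. more than 1.5 * IQR beyond the quartiles.
tukey.outlier <- function(x) {
  q <- quantile(x, probs = c(0.25, 0.75), names = FALSE)
  iqr <- q[2] - q[1]
  (x < q[1] - 1.5 * iqr) | (x > q[2] + 1.5 * iqr)
}

# Hypothetical variant: keeps the all-columns rule, swaps `&&` for `&`.
tukey_multiple_all <- function(x) {
  outliers <- array(TRUE, dim = dim(x))
  for (j in seq_len(ncol(x))) {
    # Elementwise AND across the rows of column j.
    outliers[, j] <- outliers[, j] & tukey.outlier(x[, j])
  }
  # A row is flagged only if it is an outlier in every column.
  apply(outliers, 1, all)
}

set.seed(123)                       # assumed seed, only for reproducibility
test_mat <- matrix(rnorm(50), nrow = 10)
test_mat[3, ] <- 100                # make row 3 extreme in every column
result <- tukey_multiple_all(test_mat)
```

Which rule is appropriate depends on whether a row should count as an outlier when any column is extreme or only when all of them are.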

LIS 4317: Module # 10 assignment

  Time series data, defined as observations taken at repeated time intervals, is important in many fields, such as finance, economics, and weather forecasting. Visualizing such data helps in finding patterns, drawing conclusions, and making sound decisions. First, we look at the 'economics' dataset built into ggplot2, which contains economic indicators spanning a number of years. This dataset includes variables such as population, unemployment, and the median duration of unemployment. Using a time series plot of the unemployment rate, we first investigate patterns and variations in the dataset. By charting the median duration of unemployment over time, we can see possible long-term trends or seasonal patterns. Furthermore, we explore combined visualizations, comparing several time series plots to look for correlation or comparability. The flexible framework of ggplot2 allows analysts an...
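A minimal sketch of these time series plots, using the economics dataset that ships with ggplot2, might look like this. The dataset stores unemployment as a count (unemploy, in thousands) rather than a rate, so the percentage computed below is a derived series and an assumption about how a rate could be obtained. It is a share of the total population, not the official labor-force unemployment rate.

```r
library(ggplot2)

econ <- ggplot2::economics  # monthly US economic indicators

# Median duration of unemployment (uempmed, in weeks) over time.
p_duration <- ggplot(econ, aes(x = date, y = uempmed)) +
  geom_line() +
  labs(x = "Date", y = "Median duration of unemployment (weeks)")

# Derived series (assumption): unemployed as a percentage of total population.
# Both unemploy and pop are recorded in thousands, so the units cancel.
econ$unemploy_pct <- 100 * econ$unemploy / econ$pop

p_share <- ggplot(econ, aes(x = date, y = unemploy_pct)) +
  geom_line() +
  labs(x = "Date", y = "Unemployed (% of population)")

print(p_duration)
print(p_share)
```

Plotting the two series on the same date axis makes it easier to compare when the duration and the share of unemployed rise together.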

LIS 4370: Module # 10 Building your own R package

GitHub Link: https://github.com/agremer/LIS4370/blob/main/DESCRIPTION%20File

DESCRIPTION file:

    Package: DataVisTools
    Title: Alec Gremer's Data Visualization Test Implementation
    Version: 0.1.0.9000
    Authors@R: "Alec Gremer, agremer@usf.edu [aut, cre]"
    Description: DataVisTools is an R package designed to streamline data visualization tasks by offering a versatile set of functions. It will include interactive visualization features, customizable plot aesthetics, statistical visualization tools, geospatial mapping capabilities, and time series analysis functions. Comprehensive documentation and an MIT License will ensure ease of use and widespread adoption. Implementation will prioritize S3 classes and methods for flexibility.
    Depends: R (>= 3.1.2)
    License: CC0
    LazyData: true

For the final project, I proposed an R package named DataVisTools, which will serve as a comprehensive toolkit for data visualization tasks. Thi...
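The Authors@R field of a DESCRIPTION file conventionally holds R code that calls person(), rather than a plain string. The sketch below shows one way the entry could be built and printed. It assumes the intent is a single person who is both author ("aut") and maintainer ("cre"), as the role codes in the file suggest.

```r
# Build the author entry with utils::person().
author <- person(
  given  = "Alec",
  family = "Gremer",
  email  = "agremer@usf.edu",
  role   = c("aut", "cre")   # aut = author, cre = maintainer
)

# Print the R expression that could be pasted into the Authors@R field.
cat("Authors@R:", format(author, style = "R"), "\n")
```

Writing the field as a person() call lets R parse the name, email, and roles separately, which is what package tooling expects when it derives the Author and Maintainer fields.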