SESSION JUL – AUG 2024
PROGRAM MASTER OF BUSINESS ADMINISTRATION (MBA)
SEMESTER III
COURSE CODE & NAME DADS301 PROGRAMMING IN DATA SCIENCE
Assignment Set – 1
1. (a) What is Data wrangling? Name the package used for Data wrangling in R and
describe some of its features.
(b) What are vectors? Explain the creation of vectors with examples. Also, describe how
to identify and handle missing values.
Ans 1. Data Wrangling and Vectors in R
(a) Data Wrangling
Data wrangling, also known as data munging, is the process of cleaning, structuring, and
enriching raw data into a desired format for better decision-making. This process is critical in
data analysis as it ensures data is accurate, complete, and usable. The raw data collected from
different sources often contains errors, inconsistencies, and missing values, which can hinder
analysis. Data wrangling involves several steps such as data cleaning, transformation,
normalization, merging, and enrichment.
In R, one of the most popular packages used for data wrangling is the dplyr package, which is
2. (a) Describe the steps to initialize a plot in R, specify aesthetics, create a simple plot,
and add titles and labels to the plot in R.
(b) Explain the chaining operator with an example.
Ans 2. Plot Initialization and Chaining Operators in R
(a) Initializing a Plot in R
Visualization is an essential part of data analysis, and R provides robust tools for creating
plots. One of the most popular libraries for plotting in R is ggplot2, which is also part of the
tidyverse. Creating a plot in ggplot2 involves several steps:
1. Load the Library: Before creating a plot, load the ggplot2 library.
library(ggplot2)
2. Initialize the Plot and Specify Aesthetics: The ggplot() function initializes the plot, and
3. (a) Explain with an example how box plots can be used to understand the relationship
between a continuous and a categorical feature. What insights can be derived
from such plots?
(b) What is a continuous random variable? How can one be created using R?
Ans 3. Box Plots and Continuous Random Variables
(a) Understanding Box Plots
Box plots are a graphical tool used to summarize the distribution of a continuous
variable within each level of a categorical feature. They are also known as
box-and-whisker plots and provide insights into data spread, central tendency, and
potential outliers.
A box plot displays the following components:
1. Median: The line inside the box represents the median of the data.
2. Interquartile Range (IQR): The box itself spans the first quartile (Q1) to the third
quartile (Q3), representing the middle 50% of the data.
Assignment Set – 2
4. (a) Describe the need for Python and applications of Python.
(b) Explain with examples – Set, List and Tuples. What are the similarities and
differences among them? 5+5
Ans 4. Python Applications and Data Structures
(a) The Need for Python
Python has emerged as one of the most popular programming languages due to its simplicity,
versatility, and extensive libraries. Its growing prominence across industries stems from its
ability to handle a variety of tasks, from web development to data analysis.
Why Python is Needed:
1. Ease of Learning: Python’s syntax is clean and resembles natural language, making
it accessible to beginners.
2. Extensive Libraries: Libraries like NumPy, Pandas, TensorFlow, and Matplotlib
5. (a) How are strings converted into iterables? Explain with an example how the
iterables thus created can be iterated through.
(b) Explain how simple and complex pattern searches can be performed on lists.
Ans 5. Strings as Iterables and Pattern Searches on Lists
(a) Strings as Iterables
In Python, strings are inherently iterable, meaning you can traverse each character in a string
directly with a for loop. To obtain an explicit iterator, pass the string to the built-in
iter() function, which returns an iterator object over its characters. Calling next() on this
iterator then yields the string one character at a time.
Example:
# Converting a string into an iterable
my_string = "Python"
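As a minimal sketch (not necessarily the example the original answer goes on to give), the snippet below shows one way iter() and next() can be combined with a loop to walk through a string. The variable names and the sample value are assumptions chosen for illustration.

```python
# Illustrative sketch: strings are iterable, and iter() returns an iterator over them.
text = "Python"  # hypothetical sample value

# iter() builds an iterator; next() pulls one character at a time.
char_iter = iter(text)
first = next(char_iter)
second = next(char_iter)

# A loop or comprehension consumes whatever is left in the same iterator.
remaining = [ch for ch in char_iter]

# Once exhausted, next() raises StopIteration unless a default is supplied.
sentinel = next(char_iter, None)

print(first, second, remaining, sentinel)
```

Because the iterator remembers its position, the comprehension picks up at the third character rather than starting over; iterating over the string itself would start from the beginning each time.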
6. (a) What are Stacked Bar charts? When are they used? Explain with an example.
(b) Explain the merging of two data frames with similar name for the ‘join’ column and
dissimilar names for the ‘join’ column.
Ans 6. Stacked Bar Charts and Merging Data Frames
(a) Stacked Bar Charts
Stacked bar charts are a type of bar chart used to visualize the contribution of different
components to a total over multiple categories. In a stacked bar chart, each bar is divided into
segments representing individual components, stacked on top of each other. This allows for a
clear comparison of both total values and individual contributions across categories.
When to Use:
1. Category Comparison: To compare total values across categories.
2. Component Analysis: To examine the distribution of sub-categories within each
category.
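To make the stacking idea concrete, here is a small sketch that computes where each segment of a stacked bar starts and how tall each finished bar is. The product names, quarters, and sales figures are invented for illustration, and stack_offsets is a hypothetical helper, not a library function.

```python
# Hypothetical data: sales per product across three quarters (made-up values).
sales = {
    "Product A": [10, 15, 12],
    "Product B": [5, 8, 9],
    "Product C": [7, 3, 6],
}
categories = ["Q1", "Q2", "Q3"]


def stack_offsets(series):
    """Return (bottoms, totals): each segment's starting height and each bar's total."""
    n = len(next(iter(series.values())))
    running = [0] * n
    bottoms = {}
    for name, values in series.items():
        # A segment starts where the segments stacked before it end.
        bottoms[name] = list(running)
        running = [r + v for r, v in zip(running, values)]
    return bottoms, running


bottoms, totals = stack_offsets(sales)
for name in sales:
    print(name, "starts at", bottoms[name])
print("bar totals:", dict(zip(categories, totals)))
```

The totals support the category comparison, while the per-segment offsets and heights support the component analysis. Plotting libraries typically accept such offsets directly; for instance, matplotlib's bar function takes a bottom argument for this purpose.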