Data Wrangling

For this portion of the project, you will examine your dataset for incorrect data. Any incorrect data should be removed, corrected, or imputed. Follow these steps:

  • Remove irrelevant data. If you are unsure whether something is irrelevant, keep it.
  • Remove duplicate records.
  • Make sure numbers are stored as numerical data types.
  • Fix typos.
  • Standardize inconsistent values, such as capitalization, spelling, and formats.
  • Investigate outliers.
  • Check and manage missing values.
  • Format and normalize data if needed.
  • Change categorical values into numbers if needed.
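
Several of the steps above can be sketched in a few lines of Python. This is only an illustration, not a required method: the sample rows are hypothetical (the duplicate record, the stray whitespace, and the empty cell are invented to exercise the steps), the function name clean is made up for this example, and setting incomplete records aside is just one possible way to manage missing values.

```python
import csv
import io

# Hypothetical sample in the dataset's column layout; the duplicate row,
# stray whitespace, and empty cell are invented to exercise the steps.
RAW = """location,date,variant,num_sequences,perc_sequences,num_sequences_total
Angola,2020-10-12,Beta,24,82.76,29
Angola,2020-10-12,Beta,24,82.76,29
 angola ,2020-10-12,others,5,17.24,29
Angola,2020-10-12,Delta,0,,29
"""


def clean(raw_csv):
    """Return (cleaned_rows, rows_with_missing_values)."""
    seen = set()
    cleaned, missing = [], []
    for row in csv.DictReader(io.StringIO(raw_csv)):
        # Standardize: trim whitespace and use one capitalization for location.
        row = {key: value.strip() for key, value in row.items()}
        row["location"] = row["location"].title()
        # Remove duplicates: skip a record whose values were already seen.
        fingerprint = tuple(row.values())
        if fingerprint in seen:
            continue
        seen.add(fingerprint)
        # Manage missing values: set incomplete records aside for review.
        if any(value == "" for value in row.values()):
            missing.append(row)
            continue
        # Interpret numbers as numerical data types.
        row["num_sequences"] = int(row["num_sequences"])
        row["perc_sequences"] = float(row["perc_sequences"])
        row["num_sequences_total"] = int(row["num_sequences_total"])
        cleaned.append(row)
    return cleaned, missing


cleaned_rows, missing_rows = clean(RAW)
print(len(cleaned_rows), "clean rows;", len(missing_rows), "with missing values")
```

Whether incomplete records should be removed, corrected, or imputed depends on the dataset; the sketch only separates them so the decision can be documented in the summary.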

Once you have completed this, you will need to provide a Word document summarizing the pre-processing steps performed on your dataset.

location,date,variant,num_sequences,perc_sequences,num_sequences_total
Angola,2020-07-06,Alpha,0,0.0,3
Angola,2020-07-06,B.1.1.277,0,0.0,3
Angola,2020-07-06,B.1.1.302,0,0.0,3
Angola,2020-07-06,B.1.1.519,0,0.0,3
Angola,2020-07-06,B.1.160,0,0.0,3
Angola,2020-07-06,B.1.177,0,0.0,3
Angola,2020-07-06,B.1.221,0,0.0,3
Angola,2020-07-06,B.1.258,0,0.0,3
Angola,2020-07-06,B.1.367,0,0.0,3
Angola,2020-07-06,B.1.620,0,0.0,3
Angola,2020-07-06,Beta,0,0.0,3
Angola,2020-07-06,Delta,0,0.0,3
Angola,2020-07-06,Epsilon,0,0.0,3
Angola,2020-07-06,Eta,0,0.0,3
Angola,2020-07-06,Gamma,0,0.0,3
Angola,2020-07-06,Iota,0,0.0,3
Angola,2020-07-06,Kappa,0,0.0,3
Angola,2020-07-06,Lambda,0,0.0,3
Angola,2020-07-06,Mu,0,0.0,3
Angola,2020-07-06,Omicron,0,0.0,3
Angola,2020-07-06,S:677H.Robin1,0,0.0,3
Angola,2020-07-06,S:677P.Pelican,0,0.0,3
Angola,2020-07-06,others,3,100.0,3
Angola,2020-07-06,non_who,3,100.0,3
Angola,2020-08-31,Alpha,0,0.0,1
Angola,2020-08-31,B.1.1.277,0,0.0,1
Angola,2020-08-31,B.1.1.302,0,0.0,1
Angola,2020-08-31,B.1.1.519,0,0.0,1
Angola,2020-08-31,B.1.160,0,0.0,1
Angola,2020-08-31,B.1.177,0,0.0,1
Angola,2020-08-31,B.1.221,0,0.0,1
Angola,2020-08-31,B.1.258,0,0.0,1
Angola,2020-08-31,B.1.367,0,0.0,1
Angola,2020-08-31,B.1.620,0,0.0,1
Angola,2020-08-31,Beta,1,100.0,1
Angola,2020-08-31,Delta,0,0.0,1
Angola,2020-08-31,Epsilon,0,0.0,1
Angola,2020-08-31,Eta,0,0.0,1
Angola,2020-08-31,Gamma,0,0.0,1
Angola,2020-08-31,Iota,0,0.0,1
Angola,2020-08-31,Kappa,0,0.0,1
Angola,2020-08-31,Lambda,0,0.0,1
Angola,2020-08-31,Mu,0,0.0,1
Angola,2020-08-31,Omicron,0,0.0,1
Angola,2020-08-31,S:677H.Robin1,0,0.0,1
Angola,2020-08-31,S:677P.Pelican,0,0.0,1
Angola,2020-08-31,others,0,0.0,1
Angola,2020-08-31,non_who,0,0.0,1
Angola,2020-09-28,Alpha,0,0.0,10
Angola,2020-09-28,B.1.1.277,0,0.0,10
Angola,2020-09-28,B.1.1.302,0,0.0,10
Angola,2020-09-28,B.1.1.519,0,0.0,10
Angola,2020-09-28,B.1.160,0,0.0,10
Angola,2020-09-28,B.1.177,0,0.0,10
Angola,2020-09-28,B.1.221,0,0.0,10
Angola,2020-09-28,B.1.258,0,0.0,10
Angola,2020-09-28,B.1.367,0,0.0,10
Angola,2020-09-28,B.1.620,0,0.0,10
Angola,2020-09-28,Beta,9,90.0,10
Angola,2020-09-28,Delta,0,0.0,10
Angola,2020-09-28,Epsilon,0,0.0,10
Angola,2020-09-28,Eta,0,0.0,10
Angola,2020-09-28,Gamma,0,0.0,10
Angola,2020-09-28,Iota,0,0.0,10
Angola,2020-09-28,Kappa,0,0.0,10
Angola,2020-09-28,Lambda,0,0.0,10
Angola,2020-09-28,Mu,0,0.0,10
Angola,2020-09-28,Omicron,0,0.0,10
Angola,2020-09-28,S:677H.Robin1,0,0.0,10
Angola,2020-09-28,S:677P.Pelican,0,0.0,10
Angola,2020-09-28,others,1,10.0,10
Angola,2020-09-28,non_who,1,10.0,10
Angola,2020-10-12,Alpha,0,0.0,29
Angola,2020-10-12,B.1.1.277,0,0.0,29
Angola,2020-10-12,B.1.1.302,0,0.0,29
Angola,2020-10-12,B.1.1.519,0,0.0,29
Angola,2020-10-12,B.1.160,0,0.0,29
Angola,2020-10-12,B.1.177,0,0.0,29
Angola,2020-10-12,B.1.221,0,0.0,29
Angola,2020-10-12,B.1.258,0,0.0,29
Angola,2020-10-12,B.1.367,0,0.0,29
Angola,2020-10-12,B.1.620,0,0.0,29
Angola,2020-10-12,Beta,24,82.76,29
Angola,2020-10-12,Delta,0,0.0,29
Angola,2020-10-12,Epsilon,0,0.0,29
Angola,2020-10-12,Eta,0,0.0,29
Angola,2020-10-12,Gamma,0,0.0,29
Angola,2020-10-12,Iota,0,0.0,29
Angola,2020-10-12,Kappa,0,0.0,29
Angola,2020-10-12,Lambda,0,0.0,29
Angola,2020-10-12,Mu,0,0.0,29
Angola,2020-10-12,Omicron,0,0.0,29
Angola,2020-10-12,S:677H.Robin1,0,0.0,29
Angola,2020-10-12,S:677P.Pelican,0,0.0,29
Angola,2020-10-12,others,5,17.24,29
Angola,2020-10-12,non_who,5,17.24,29
Angola,2020-10-26,Alpha,0,0.0,7
Angola,2020-10-26,B.1.1.277,0,0.0,7
Angola,2020-10-26,B.1.1.302,0,0.0,7
Angola,2020-10-26,B.1.1.519,0,0.0,7
Angola,2020-10-26,B.1.160,0,0.0,7
Angola,2020-10-26,B.1.177,0,0.0,7
Angola,2020-10-26,B.1.221,0,0.0,7
Angola,2020-10-26,B.1.258,0,0.0,7
Angola,2020-10-26,B.1.367,0,0.0,7
Angola,2020-10-26,B.1.620,0,0.0,7
Angola,2020-10-26,Beta,7,100.0,7
Angola,2020-10-26,Delta,0,0.0,7
Angola,2020-10-26,Epsilon,0,0.0,7
Angola,2020-10-26,Eta,0,0.0,7
Angola,2020-10-26,Gamma,0,0.0,7
Angola,2020-10-26,Iota,0,0.0,7
Angola,2020-10-26,Kappa,0,0.0,7
Angola,2020-10-26,Lambda,0,0.0,7
Angola,2020-10-26,Mu,0,0.0,7
Angola,2020-10-26,Omicron,0,0.0,7
Angola,2020-10-26,S:677H.Robin1,0,0.0,7
Angola,2020-10-26,S:677P.Pelican,0,0.0,7
Angola,2020-10-26,others,0,0.0,7
Angola,2020-10-26,non_who,0,0.0,7
Angola,2020-12-07,Alpha,0,0.0,6
Angola,2020-12-07,B.1.1.277,0,0.0,6
Angola,2020-12-07,B.1.1.302,0,0.0,6
Angola,2020-12-07,B.1.1.519,0,0.0,6
Angola,2020-12-07,B.1.160,0,0.0,6
Angola,2020-12-07,B.1.177,0,0.0,6
Angola,2020-12-07,B.1.221,0,0.0,6
Angola,2020-12-07,B.1.258,0,0.0,6
Angola,2020-12-07,B.1.367,0,0.0,6
Angola,2020-12-07,B.1.620,0,0.0,6
Angola,2020-12-07,Beta,6,100.0,6
Angola,2020-12-07,Delta,0,0.0,6
Angola,2020-12-07,Epsilon,0,0.0,6
Angola,2020-12-07,Eta,0,0.0,6
Angola,2020-12-07,Gamma,0,0.0,6
Angola,2020-12-07,Iota,0,0.0,6
Angola,2020-12-07,Kappa,0,0.0,6
Angola,2020-12-07,Lambda,0,0.0,6
Angola,2020-12-07,Mu,0,0.0,6
Angola,2020-12-07,Omicron,0,0.0,6
Angola,2020-12-07,S:677H.Robin1,0,0.0,6
Angola,2020-12-07,S:677P.Pelican,0,0.0,6
Angola,2020-12-07,others,0,0.0,6
Angola,2020-12-07,non_who,0,0.0,6
Angola,2020-12-21,Alpha,0,0.0,93
Angola,2020-12-21,B.1.1.277,0,0.0,93
Angola,2020-12-21,B.1.1.302,0,0.0,93
Angola,2020-12-21,B.1.1.519,0,0.0,93
Angola,2020-12-21,B.1.160,0,0.0,93
Angola,2020-12-21,B.1.177,0,0.0,93
Angola,2020-12-21,B.1.221,0,0.0,93
Angola,2020-12-21,B.1.258,0,0.0,93
Angola,2020-12-21,B.1.367,0,0.0,93
Angola,2020-12-21,B.1.620,0,0.0,93
Angola,2020-12-21,Beta,69,74.19,93
Angola,2020-12-21,Delta,0,0.0,93
Angola,2020-12-21,Epsilon,0,0.0,93
Angola,2020-12-21,Eta,1,1.08,93
Angola,2020-12-21,Gamma,0,0.0,93
Angola,2020-12-21,Iota,0,0.0,93
Angola,2020-12-21,Kappa,0,0.0,93
Angola,2020-12-21,Lambda,0,0.0,93
Angola,2020-12-21,Mu,0,0.0,93
Angola,2020-12-21,Omicron,0,0.0,93
Angola,2020-12-21,S:677H.Robin1,0,0.0,93
Angola,2020-12-21,S:677P.Pelican,0,0.0,93
Angola,2020-12-21,others,23,24.73,93
Angola,2020-12-21,non_who,23,24.73,93
Angola,2021-01-04,Alpha,0,0.0,5
Angola,2021-01-04,B.1.1.277,0,0.0,5
Angola,2021-01-04,B.1.1.302,0,0.0,5
Angola,2021-01-04,B.1.1.519,0,0.0,5
Angola,2021-01-04,B.1.160,0,0.0,5
Angola,2021-01-04,B.1.177,0,0.0,5
Angola,2021-01-04,B.1.221,0,0.0,5
Angola,2021-01-04,B.1.258,0,0.0,5
Angola,2021-01-04,B.1.367,0,0.0,5
Angola,2021-01-04,B.1.620,0,0.0,5
Angola,2021-01-04,Beta,0,0.0,5
Angola,2021-01-04,Delta,0,0.0,5
Angola,2021-01-04,Epsilon,0,0.0,5
Angola,2021-01-04,Eta,0,0.0,5
Angola,2021-01-04,Gamma,0,0.0,5
Angola,2021-01-04,Iota,0,0.0,5
Angola,2021-01-04,Kappa,0,0.0,5
Angola,2021-01-04,Lambda,0,0.0,5
Angola,2021-01-04,Mu,0,0.0,5
Angola,2021-01-04,Omicron,0,0.0,5
Angola,2021-01-04,S:677H.Robin1,0,0.0,5
Angola,2021-01-04,S:677P.Pelican,0,0.0,5
Angola,2021-01-04,others,5,100.0,5
Angola,2021-01-04,non_who,5,100.0,5
Angola,2021-01-11,Alpha,0,0.0,16
Angola,2021-01-11,B.1.1.277,0,0.0,16
Angola,2021-01-11,B.1.1.302,0,0.0,16
Angola,2021-01-11,B.1.1.519,0,0.0,16
Angola,2021-01-11,B.1.160,0,0.0,16
Angola,2021-01-11,B.1.177,0,0.0,16
Angola,2021-01-11,B.1.221,0,0.0,16
Angola,2021-01-11,B.1.258,0,0.0,16
Angola,2021-01-11,B.1.367,0,0.0,16
Angola,2021-01-11,B.1.620,0,0.0,16
Angola,2021-01-11,Beta,1,6.25,16
Angola,2021-01-11,Delta,0,0.0,16
Angola,2021-01-11,Epsilon,0,0.0,16
Angola,2021-01-11,Eta,0,0.0,16
Angola,2021-01-11,Gamma,0,0.0,16
Angola,2021-01-11,Iota,0,0.0,16
Angola,2021-01-11,Kappa,0,0.0,16
Angola,2021-01-11,Lambda,0,0.0,16
Angola,2021-01-11,Mu,0,0.0,16
Angola,2021-01-11,Omicron,0,0.0,16
Angola,2021-01-11,S:677H.Robin1,0,0.0,16
Angola,2021-01-11,S:677P.Pelican,0,0.0,16
Angola,2021-01-11,others,15,93.75,16
Angola,2021-01-11,non_who,15,93.75,16
Angola,2021-01-25,Alpha,3,5.77,52
Angola,2021-01-25,B.1.1.277,0,0.0,52
Angola,2021-01-25,B.1.1.302,0,0.0,52
Angola,2021-01-25,B.1.1.519,0,0.0,52
Angola,2021-01-25,B.1.160,1,1.92,52
Angola,2021-01-25,B.1.177,2,3.85,52
Angola,2021-01-25,B.1.221,0,0.0,52
Angola,2021-01-25,B.1.258,1,1.92,52
Angola,2021-01-25,B.1.367,0,0.0,52
Angola,2021-01-25,B.1.620,0,0.0,52
Angola,2021-01-25,Beta,2,3.85,52
Angola,2021-01-25,Delta,0,0.0,52
Angola,2021-01-25,Epsilon,0,0.0,52
Angola,2021-01-25,Eta,0,0.0,52
Angola,2021-01-25,Gamma,0,0.0,52
Angola,2021-01-25,Iota,0,0.0,52
Angola,2021-01-25,Kappa,0,0.0,52
Angola,2021-01-25,Lambda,0,0.0,52
Angola,2021-01-25,Mu,0,0.0,52
Angola,2021-01-25,Omicron,0,0.0,52
Angola,2021-01-25,S:677H.Robin1,0,0.0,52
Angola,2021-01-25,S:677P.Pelican,0,0.0,52
Angola,2021-01-25,others,43,82.69,52
Angola,2021-01-25,non_who,47,90.38,52
Angola,2021-02-08,Alpha,4,9.52,42
Angola,2021-02-08,B.1.1.277,0,0.0,42
Angola,2021-02-08,B.1.1.302,0,0.0,42
Angola,2021-02-08,B.1.1.519,0,0.0,42
Angola,2021-02-08,B.1.160,1,2.38,42
Angola,2021-02-08,B.1.177,0,0.0,42
Angola,2021-02-08,B.1.221,0,0.0,42
Angola,2021-02-08,B.1.258,0,0.0,42
Angola,2021-02-08,B.1.367,0,0.0,42
Angola,2021-02-08,B.1.620,0,0.0,42
Angola,2021-02-08,Beta,12,28.57,42
Angola,2021-02-08,Delta,0,0.0,42
Angola,2021-02-08,Epsilon,0,0.0,42
Angola,2021-02-08,Eta,0,0.0,42
Angola,2021-02-08,Gamma,0,0.0,42
Angola,2021-02-08,Iota,0,0.0,42
Angola,2021-02-08,Kappa,0,0.0,42
Angola,2021-02-08,Lambda,0,0.0,42
Angola,2021-02-08,Mu,0,0.0,42
Angola,2021-02-08,Omicron,0,0.0,42
Angola,2021-02-08,S:677H.Robin1,0,0.0,42
Angola,2021-02-08,S:677P.Pelican,0,0.0,42
Angola,2021-02-08,others,25,59.53,42
Angola,2021-02-08,non_who,26,61.91,42
Angola,2021-02-22,Alpha,1,2.56,39
Angola,2021-02-22,B.1.1.277,0,0.0,39
Angola,2021-02-22,B.1.1.302,0,0.0,39
Angola,2021-02-22,B.1.1.519,0,0.0,39
Angola,2021-02-22,B.1.160,1,2.56,39
Angola,2021-02-22,B.1.177,1,2.56,39
Angola,2021-02-22,B.1.221,0,0.0,39
Angola,2021-02-22,B.1.258,0,0.0,39
Angola,2021-02-22,B.1.367,0,0.0,39
Angola,2021-02-22,B.1.620,0,0.0,39
Angola,2021-02-22,Beta,16,41.03,39
Angola,2021-02-22,Delta,0,0.0,39
Angola,2021-02-22,Epsilon,0,0.0,39
Angola,2021-02-22,Eta,0,0.0,39
Angola,2021-02-22,Gamma,0,0.0,39
Angola,2021-02-22,Iota,0,0.0,39
Angola,2021-02-22,Kappa,0,0.0,39
Angola,2021-02-22,Lambda,0,0.0,39
Angola,2021-02-22,Mu,0,0.0,39
Angola,2021-02-22,Omicron,0,0.0,39
Angola,2021-02-22,S:677H.Robin1,0,0.0,39
Angola,2021-02-22,S:677P.Pelican,0,0.0,39
Angola,2021-02-22,others,20,51.29,39
Angola,2021-02-22,non_who,22,56.41,39
Angola,2021-03-08,Alpha,7,6.19,113
Angola,2021-03-08,B.1.1.277,0,0.0,113
Angola,2021-03-08,B.1.1.302,0,0.0,113
Angola,2021-03-08,B.1.1.519,0,0.0,113
Angola,2021-03-08,B.1.160,0,0.0,113
Angola,2021-03-08,B.1.177,1,0.88,113
Angola,2021-03-08,B.1.221,0,0.0,113
Angola,2021-03-08,B.1.258,0,0.0,113
Angola,2021-03-08,B.1.367,0,0.0,113
Angola,2021-03-08,B.1.620,0,0.0,113
Angola,2021-03-08,Beta,69,61.06,113
Angola,2021-03-08,Delta,0,0.0,113
Angola,2021-03-08,Epsilon,0,0.0,113
Angola,2021-03-08,Eta,0,0.0,113
Angola,2021-03-08,Gamma,0,0.0,113
Angola,2021-03-08,Iota,0,0.0,113
Angola,2021-03-08,Kappa,0,0.0,113
Angola,2021-03-08,Lambda,0,0.0,113
Angola,2021-03-08,Mu,0,0.0,113
Angola,2021-03-08,Omicron,0,0.0,113
Angola,2021-03-08,S:677H.Robin1,0,0.0,113
Angola,2021-03-08,S:677P.Pelican,0,0.0,113
Angola,2021-03-08,others,36,31.87,113
Angola,2021-03-08,non_who,37,32.75,113
Angola,2021-03-22,Alpha,12,9.92,121
Angola,2021-03-22,B.1.1.277,0,0.0,121
Angola,2021-03-22,B.1.1.302,0,0.0,121
Angola,2021-03-22,B.1.1.519,0,0.0,121
Angola,2021-03-22,B.1.160,0,0.0,121
Angola,2021-03-22,B.1.177,0,0.0,121
Angola,2021-03-22,B.1.221,0,0.0,121
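
One way to look for incorrect values in this sample is to recompute the percentage column from the two count columns. The sketch below assumes that perc_sequences is num_sequences as a percentage of num_sequences_total, which the rows above appear to follow (for example, 24 of 29 gives 82.76). The function name inconsistent_rows and the 0.05 tolerance are choices made for this illustration, and the last sample row is a hypothetical typo, not a record from the dataset.

```python
def inconsistent_rows(rows, tolerance=0.05):
    """Flag rows whose stored percentage disagrees with the recomputed one.

    Each row is (variant, num_sequences, perc_sequences, num_sequences_total).
    """
    flagged = []
    for variant, num, perc, total in rows:
        # Guard against division by zero when no sequences were recorded.
        expected = 100.0 * num / total if total else 0.0
        if abs(perc - expected) > tolerance:
            flagged.append((variant, perc, round(expected, 2)))
    return flagged


sample = [
    ("Beta", 24, 82.76, 29),    # from the 2020-10-12 rows
    ("others", 5, 17.24, 29),   # from the 2020-10-12 rows
    ("others", 25, 59.53, 42),  # from the 2021-02-08 rows
    ("Eta", 1, 10.8, 93),       # hypothetical typo: decimal point misplaced
]

for variant, stored, recomputed in inconsistent_rows(sample):
    print(f"{variant}: stored {stored}, recomputed {recomputed}")
```

The tolerance allows for small rounding differences in the stored percentages, so only the misplaced decimal point is reported.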
