thr3ads.net - R help - [R] Splitting a vector into data frame [Mar 2016]

Burhan ul haq

2016-Mar-24 10:30 UTC

[R] Splitting a vector into data frame

Hi,

1. I have scraped some data from the web; a subset is shown below:
> dput(temp.data)
c("Armenia", "Armenia", "43827", "39200", "35700", "36700", "39341",
"30571", "0", "0", "0", "0", "0", "0", "0", "0", "0", "0", " 0",
"0", "0", "0", "0", "Austria", "Austria", "135417", "166200",
"144500", "147300", "163211", "162536", "155412", "133667", "134962",
"146440", "131188", "100001", "100000", "80000", "35000")

2. The corresponding list of countries is as follows:
> dput(raw.country)
c("Armenia", "Austria", "Belarus", "Belgium", "Brazil", "Bulgaria",
"Canada", "Castile-Leon (Hiszania)", "Catalonia", "Chile", "Colombia",
"Costarica", "Croatia", "Cyprus", "Czech Republic", "Ecuador",
"Estonia", "Finland", "France", "Georgia", "Germany", "Ghana",
"Greece", "Hungary", "Indonesia", "Iran", "Ireland", "Israel",
"Italy", "Kazakhstan", "Kyrgyzstan", "Latvia", "Lithuania", "Macedonia",
"Malaysia", "Mexico", "Moldova", "Mongolia", "Netherland", "Norway",
"Pakistan", "Panama", "Paraguay", "Peru", "Poland", "Portugal",
"Puertorico", "Romania", "Russia", "Serbia", "Slovakia", "Slovenia",
"Spain", "Sweden", "Switzerland", "Tunisia", "Ukraine", "United Kingdom",
"USA", "Venezuela", "Vltava", "World Total")


3. I want to organize the data into a data frame in which each row
contains the 20 values for the corresponding country.
The duplicated country name should be ignored. Something like:

"Armenia", "43827", "39200", "35700", "36700", "39341",
"30571", "0", "0", "0", "0", "0", "0", "0", "0", "0", "0", " 0",
"0", "0", "0", "0"

"Austria", "135417", "166200", "144500", "147300", "163211", "162536",
"155412", "133667", "134962", "146440", "131188", "100001", "100000",
"80000", "35000"

and so on.


Thanks /


Boris Steipe

2016-Mar-24 10:40 UTC

Your data rows have different numbers of columns. Thus your problem is not
sufficiently specified.

B. 
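
The mismatch can be checked directly. The following is only a sketch, assuming that every element of temp.data that matches an entry of raw.country is a name marker and everything else is a value; the names is.name, owner and counts are introduced here for illustration, and raw.country is shortened to the two countries in the posted subset.

```r
# Subset of the scraped vector, as posted
temp.data <- c("Armenia", "Armenia", "43827", "39200", "35700", "36700", "39341",
               "30571", "0", "0", "0", "0", "0", "0", "0", "0", "0", "0", " 0",
               "0", "0", "0", "0",
               "Austria", "Austria", "135417", "166200", "144500", "147300",
               "163211", "162536", "155412", "133667", "134962", "146440",
               "131188", "100001", "100000", "80000", "35000")
raw.country <- c("Armenia", "Austria")  # shortened list, enough for this subset

# TRUE where an element is a country name rather than a value
is.name <- temp.data %in% raw.country

# Label every element with the most recent country name seen
owner <- temp.data[is.name][cumsum(is.name)]

# Number of values (names excluded) that follow each country
counts <- table(owner[!is.name])
print(counts)
```

For the posted subset this gives 21 values for Armenia and 15 for Austria, so neither country has the 20 values the question assumes.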

Jim Lemon

2016-Mar-24 10:48 UTC

Hi Burhan,
As all of your values seem to be character, perhaps:

country.df<-as.data.frame(matrix(temp.data,ncol=22,byrow=TRUE)[,2:21])

if there really are 2 country names and 20 values for each country. As
Boris has pointed out, there are different numbers of values following
the country names in your example.

Jim
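
A note on the column indices, as a sketch only: with 2 name entries and 20 values per country the matrix has 22 columns, so 2:21 keeps the second copy of the name and the first 19 values, while 2:22 keeps the name and all 20 values. The input below is made up and perfectly regular (it is not the scraped data), and the column names country, v1 ... v20 are assumptions for illustration.

```r
# Hypothetical, regular input: 2 name entries followed by exactly 20 values
# per country (made-up numbers, not the scraped data)
demo <- c("A", "A", as.character(1:20),
          "B", "B", as.character(101:120))

m <- matrix(demo, ncol = 22, byrow = TRUE)

# Columns 2:21: second copy of the name plus the first 19 values
ncol(m[, 2:21])

# Columns 2:22: one copy of the name plus all 20 values
country.df <- as.data.frame(m[, 2:22], stringsAsFactors = FALSE)
names(country.df) <- c("country", paste0("v", 1:20))

# Convert the value columns from character to numeric; trimws() strips
# stray whitespace such as in " 0"
country.df[-1] <- lapply(country.df[-1], function(x) as.numeric(trimws(x)))
str(country.df[, 1:4])
```

Using m[, 3:22] instead would drop both copies of the name, in which case the country would have to be carried separately, for example as row names.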



Ivan Calandra

2016-Mar-24 10:53 UTC

Hi!

As Boris explained, if you do not always have the same number of values 
per country, you need to provide more details, e.g. should the empty 
cells be filled with NA?

But if you do always have 20 values per country (unlike in your sample 
data), then this could work for you:
mydf <- data.frame(matrix(temp.data, nrow=2, ncol=22, byrow=TRUE))
You can then subset to remove the 1st column:
mydf[-1]

HTH,
Ivan
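
If filling with NA turns out to be acceptable, one possible approach is sketched below. It is not from the thread: it assumes that country names never collide with values, and it uses a small made-up ragged vector; x, countries, vals, padded and result are hypothetical names.

```r
# Hypothetical ragged input: each country name appears twice, followed by a
# varying number of values (made-up numbers)
x <- c("A", "A", "1", "2", "3",
       "B", "B", "10", "20")
countries <- c("A", "B")

# Mark the name entries and label each element with its country
is.name <- x %in% countries
owner <- x[is.name][cumsum(is.name)]

# One character vector of values per country, in order of first appearance
vals <- split(x[!is.name], factor(owner[!is.name], levels = unique(owner)))

# Pad each vector with NA up to the longest one, then bind the rows
width <- max(lengths(vals))
padded <- do.call(rbind, lapply(vals, function(v) {
  as.numeric(c(v, rep(NA, width - length(v))))
}))

result <- data.frame(country = names(vals), unname(padded))
print(result)
```

Whether trailing NA is the right fill depends on which positions are actually missing in the scraped table, which the vector alone does not reveal.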

