Hi, I'm having a problem with my labels.

I am reading in a data file:
df <- read.csv(file = 'batch1extract_100k_sample.csv')

However, it's producing two sets of labels:

> labels(df)
[[1]]
 [1] "1"  "2"  "3"  "4"  "5"  "6"  "7"  "8"  "9"  "10" "11" "12" "13" "14" "15" "16" "17" "18" "19" "20" "21"
[22] "22" "23" "24" "25" "26" "27" "28" "29" "30" "31" "32" "33" "34" "35" "36" "37" "38" "39" "40" "41" "42"
[43] "43" "44" "45" "46" "47" "48" "49" "50" "51" "52" "53" "54" "55" "56" "57" "58" "59" "60" "61" "62" "63"
[64] "64" "65" "66" "67" "68" "69" "70" "71" "72" "73" "74" "75" "76" "77" "78" "79" "80" "81" "82" "83" "84"
[85] "85" "86" "87" "88" "89" "90" "91" "92" "93" "94" "95" "96" "97" "98" "99"

[[2]]
 [1] "PERSONPROFILE_POS"         "PARTY_ID"                  "PERSON_FIRST_NAME"
 [4] "PERSON_LAST_NAME"          "PERSON_MIDDLE_NAME"        "PARTY_NUMBER"
 [7] "ACCOUNT_NUMBER"            "ABILITEC_LINK"             "ADDRESS1"
[10] "ADDRESS2"                  "ADDRESS3"                  "ADDRESS4"
[13] "CITY"                      "COUNTY"                    "STATE"
[16] "PROVINCE"                  "POSTAL_CODE"               "COUNTRY"
[19] "PRIMARY_PER_TYPE"          "SELLTOADDR_LOS"            "LOCATION_ID"
[22] "SELLTOADDR_SOS"            "PARTY_SITE_ID"             "PRIMARYPHONE_CPOS"
[25] "CONTACT_POINT_ID_PCP"      "CONTACT_POINT_PURPOSE_PCP" "PHONE_LINE_TYPE"
[28] "PRIMARY_FLAG_PCP"          "PHONE_COUNTRY_CODE"        "PHONE_AREA_CODE"
[31] "PHONE_NUMBER"              "EMAIL_CPOS"                "CONTACT_POINT_ID_ECP"
[34] "CONTACT_POINT_PURPOSE_ECP" "PRIMARY_FLAG_ECP"          "EMAIL_ADDRESS"
[37] "BB_PARTY_ID"

Notice I get 2 rows for the labels: the first row is a list of numbers (which do not appear in my dataset) and the second row holds my actual labels.

I have no idea why it's returning all of the numbers in the labels command. They're definitely not there in the input file. Any suggestions?

Thank you!
Hello,

According to the help page ?labels, for a data.frame the result is simply the dimnames, that is, the row names (your numbers) and the column names. Note that read.csv returns a data.frame, not a matrix, and data frames always have row names, typically numbers. I wouldn't worry about it.

Hope this helps,

Rui Barradas

On 09-01-2014 18:29, Jeff Johnson wrote:
> Hi, I'm having a problem with my labels.
> [...]
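To illustrate the point about dimnames, here is a minimal sketch. It assumes a made-up three-row CSV (the name sample_data and the values are hypothetical, and only three of the original column names are reused), since the real file is not available here. It shows that the numbers come from the automatic row names that R attaches to the data frame, not from the input file.

```r
# Hypothetical three-row sample standing in for the real CSV file.
csv_text <- "PARTY_ID,CITY,STATE
101,Austin,TX
102,Denver,CO
103,Boise,ID"

sample_data <- read.csv(text = csv_text)

lab <- labels(sample_data)

# First component: row names that R generated automatically ("1", "2", "3").
print(lab[[1]])

# Second component: column names taken from the header line of the file.
print(lab[[2]])

# For a data frame, labels() gives the same list as dimnames().
print(identical(lab, dimnames(sample_data)))
```

With this sample, the first component has one entry per data row even though the CSV text contains no such column.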
Hi Jeff,

If you read the help for labels(), it says that for a data frame it returns the dimnames: the first component is the row names, which by default are numbers, and the second component of the list is the column names.

Since you appear to want just the latter, you could use colnames(df) instead.

(But be careful: df is also the name of a function, and it's easy to confuse things.)

Sarah

On Thu, Jan 9, 2014 at 1:29 PM, Jeff Johnson <mrjefftoyou at gmail.com> wrote:
> Hi, I'm having a problem with my labels.
> [...]

-- 
Sarah Goslee
http://www.functionaldiversity.org
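A short sketch of the colnames() suggestion, again assuming a small made-up CSV rather than the real file. The object name people is a hypothetical choice, used here only to avoid reusing df, which is already a function in the stats package (the density of the F distribution).

```r
# Hypothetical two-row sample; the object is named 'people' rather than 'df'.
people <- read.csv(text = "PARTY_ID,CITY,STATE\n101,Austin,TX\n102,Denver,CO")

# Column names only, without the row names.
print(colnames(people))

# For a data frame, names() returns the same character vector.
print(identical(names(people), colnames(people)))

# 'df' already exists as a function in the stats package.
print(is.function(stats::df))
```

Assigning to df does not break anything by itself, since R looks for a function when the name is used in a call, but a distinct name tends to make error messages and code easier to follow.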