I'm trying to find a more efficient way to calculate the percentage of rows in which a field is populated, and to repeat that for each field (column).

First, I'm counting the number of lines and reading the file:

    extract <- 'C:/Users/jeffjohn/Desktop/batchextract_100k_sample.csv'
    lines <- as.integer(countLines(extract) - 1)
    dput(lines)
    100000L
    mydf <- read.csv(file = extract, header = TRUE)

Here's the list of columns in my file:

    > dput(colnames(mydf))
    c("PERSONPROFILE_POS", "PARTY_ID", "PERSON_FIRST_NAME", "PERSON_LAST_NAME",
    "PERSON_MIDDLE_NAME", "PARTY_NUMBER", "ACCOUNT_NUMBER", "ABILITEC_LINK",
    "ADDRESS1", "ADDRESS2", "ADDRESS3", "ADDRESS4", "CITY", "COUNTY", "STATE",
    "PROVINCE", "POSTAL_CODE", "COUNTRY", "PRIMARY_PER_TYPE", "SELLTOADDR_LOS",
    "LOCATION_ID", "SELLTOADDR_SOS", "PARTY_SITE_ID", "PRIMARYPHONE_CPOS",
    "CONTACT_POINT_ID_PCP", "CONTACT_POINT_PURPOSE_PCP", "PHONE_LINE_TYPE",
    "PRIMARY_FLAG_PCP", "PHONE_COUNTRY_CODE", "PHONE_AREA_CODE", "PHONE_NUMBER",
    "EMAIL_CPOS", "CONTACT_POINT_ID_ECP", "CONTACT_POINT_PURPOSE_ECP",
    "PRIMARY_FLAG_ECP", "EMAIL_ADDRESS", "BB_PARTY_ID")

I want to calculate the percentage populated for each field. Rather than do:

    percent(length(is.null(mydf$PERSONPROFILE_POS)) / lines)
    percent(length(is.null(mydf$PARTY_ID)) / lines)

etc. and repeat for each field manually, I want to use a for loop. I am trying the following:

    a <- length(colnames(mydf))  # this is to get the total number of columns
    for (i in 1:a) print(percent(length(is.null(a)) / lines))

which isn't correct: it prints the same value on every pass, because the loop body never uses i. I'm new to programming, so I don't quite know how to deal with this.

Any suggestions? Thanks much.

-- Jeff
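
One possible way to approach the per-column calculation described above is to apply a single "is this value populated?" test to every column instead of writing a loop by hand. The sketch below is only an illustration under stated assumptions: the small data frame is a hypothetical stand-in for the real extract, the helper name is_populated is made up for this example, and "populated" is assumed to mean "neither NA nor an empty string", which may not match how blanks actually appear in the file. Note that is.null() applied to a column returns a single TRUE or FALSE, so length(is.null(x)) is always 1 and cannot count populated rows.

```r
# Hypothetical stand-in for mydf; the real data would come from read.csv().
mydf <- data.frame(
  PARTY_ID          = c(1, 2, 3, 4),
  PERSON_FIRST_NAME = c("Ann", "", NA, "Raj"),
  ADDRESS2          = c(NA, NA, "Suite 5", ""),
  stringsAsFactors  = FALSE
)

# Assumption: a value counts as "populated" when it is neither NA nor "".
# is_populated is an illustrative helper name, not part of any package.
is_populated <- function(x) !is.na(x) & as.character(x) != ""

# sapply() visits each column in turn; mean() of a logical vector is the
# fraction of TRUE values, so multiplying by 100 gives a percentage.
pct_populated <- sapply(mydf, function(col) 100 * mean(is_populated(col)))

print(round(pct_populated, 1))
```

With this toy data the result is 100 for PARTY_ID, 50 for PERSON_FIRST_NAME, and 25 for ADDRESS2. Because mean() divides by the number of rows in the data frame itself, a separate line count is not needed for this step.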