thr3ads.net - R help - [R] Error in .subset(x, j) : only 0's may be mixed with negative subscripts [Jun 2009]

Russell Ivory

2009-Jun-23 17:18 UTC

[R] Error in .subset(x, j) : only 0's may be mixed with negative subscripts

I have a data set called datastep4  with 211484 rows and 95 columns

 
> dim(datastep4)
[1] 211484     95

 

The first few column names are given below, note the first one is
"RESPONDED"

 
> names(datastep4)[1:5]
[1] "RESPONDED" "VAR_30"    "VAR_31"    "VAR_32"    "VAR_33"

 

A table of RESPONDED shows mostly zeros

 
> table(datastep4$RESPONDED)
 

     0      1

210582    902

 

 

I reduce the data set by pulling out the RESPONDED column, then verify
all is well

 
> test <- datastep4[,-datastep4$RESPONDED]
> dim(test)
[1] 211484     94
> names(test)[1:5]
[1] "VAR_30" "VAR_31" "VAR_32" "VAR_33" "VAR_34"
> class(test)
[1] "data.frame"
> test[1:10,1:10]
   VAR_30 VAR_31 VAR_32 VAR_33 VAR_34 VAR_37 VAR_38 VAR_42 VAR_45 VAR_46

1       0      0      0      0  15198      0      0      6     NA

3       0      0      0      0   8491      0      0      4     NA

4       0      0      0      0      0      0      0      0     NA

5       0      0      0      0  67671      0      0      7     NA

7       0      0      0      0   1334      0      0      1     NA

9       0      0      0      0      0      0      0      2     NA

10      0      0      0      0  24169      0      0     10     NA

11      0      0      0      0    438      0      0      3     NA

12      0      0      0      0   2158      0      0      1     NA

13      0      0      0      0  18804      0      0      4     NA
> 
 

If I reduce the data frame datastep4 by removing a few records where the
variable G102 is not 1, and removing the column named "G102" (which is
column 84),

I end up with a smaller set called datastep5 with 192701 rows and 94
columns

 
> datastep5 <- datastep4[datastep4$G102 != 1,-84]
> 
> dim(datastep5)
[1] 192701     94
> names(datastep5)[1:5]
[1] "RESPONDED" "VAR_30"    "VAR_31"    "VAR_32"    "VAR_33"
> table(datastep5$RESPONDED)
 

     0      1

141096    584

 

 

Now, if I want to reduce this data set by removing the RESPONDED column
as was done for datastep4, it blows up

 
> test <- datastep5[,-datastep5$RESPONDED]
Error in .subset(x, j) : only 0's may be mixed with negative subscripts
> 
 

 

I cannot find anything different about this new data set, other than a
few rows and column G102 removed, and the error message is too vague.
Any ideas are greatly appreciated!

 

Russell Ivory

Merrick Bank

South Jordan, UT

 

 

 


David Winsemius

2009-Jun-23 21:10 UTC


On Jun 23, 2009, at 1:18 PM, Russell Ivory wrote:
> I have a data set called datastep4  with 211484 rows and 95 columns
>
WHY ALL OF THE UNNEEDED EMPTY LINES???
>> dim(datastep4)
>
> [1] 211484     95
>
> The first few column names are given below, note the first one is
> "RESPONDED"
>
>> names(datastep4)[1:5]
>
> [1] "RESPONDED" "VAR_30"    "VAR_31"    "VAR_32"    "VAR_33"
>
> A table of RESPONDED shows mostly zeros
>
>> table(datastep4$RESPONDED)
>
>     0      1
>
> 210582    902
>
> I reduce the data set by pulling out the RESPONDED column, then verify
> all is well
>
>> test <- datastep4[,-datastep4$RESPONDED]
It may have "worked", but perhaps not for the reasons you thought it
should. Look carefully at this:

 > str(data2)
'data.frame':	300 obs. of  5 variables:
  $ x1 : num  0.0592 0.3976 0.9512 0.675 0.7129 ...
  $ x2 : num  0.625 0.328 0.721 0.779 0.233 ...
  $ y  : num  0.685 0.694 1.589 1.461 0.921 ...
  $ grp: Factor w/ 3 levels "A","B","C": 1 1 1 1 1 1 1 1 1 1 ...
  $ one: num  1 1 1 1 1 1 1 1 1 1 ...

 > str(data2[,-data2$one])
'data.frame':	300 obs. of  4 variables:
  $ x2 : num  0.625 0.328 0.721 0.779 0.233 ...
  $ y  : num  0.685 0.694 1.589 1.461 0.921 ...
  $ grp: Factor w/ 3 levels "A","B","C": 1 1 1 1 1 1 1 1 1 1 ...
  $ one: num  1 1 1 1 1 1 1 1 1 1 ...

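Notice that the "one" column was _not_ removed. The first column, x1,
was dropped instead, because -data2$one is just a vector of -1s, and a
negative subscript names a column position, not a column.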
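
To make the mechanism concrete, here is a minimal sketch on made-up data. The data frame `toy`, its columns, and the vector `flag` are hypothetical and only illustrate how a negated column is interpreted as a set of positions:

```r
# Hypothetical toy data; only the column layout matters here.
toy <- data.frame(x1  = c(0.1, 0.2, 0.3),
                  x2  = c(0.4, 0.5, 0.6),
                  one = c(1, 1, 1))

# Negating the column gives a vector of column *positions* to drop,
# not a reference to the column itself.
idx <- -toy$one
print(idx)                  # -1 -1 -1

# Every entry says "drop column 1", so x1 disappears and `one` stays.
kept <- names(toy[, idx])
print(kept)                 # "x2" "one"

# A 0/1 vector behaves the same way: 0s are ignored, -1s drop column 1.
flag <- c(0, 1, 0)
kept_flag <- names(toy[, -flag])
print(kept_flag)            # "x2" "one"
```

So the original call only appeared to remove RESPONDED because RESPONDED happened to be column 1.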

>> dim(test)
>
> [1] 211484     94
>
>> names(test)[1:5]
>
> [1] "VAR_30" "VAR_31" "VAR_32" "VAR_33" "VAR_34"
>
>> class(test)
>
> [1] "data.frame"
>
>> test[1:10,1:10]
>
>   VAR_30 VAR_31 VAR_32 VAR_33 VAR_34 VAR_37 VAR_38 VAR_42 VAR_45 VAR_46
>
> 1       0      0      0      0  15198      0      0      6     NA
>
> 3       0      0      0      0   8491      0      0      4     NA
>
> 4       0      0      0      0      0      0      0      0     NA
>
> 5       0      0      0      0  67671      0      0      7     NA
>
> 7       0      0      0      0   1334      0      0      1     NA
>
> 9       0      0      0      0      0      0      0      2     NA
>
> 10      0      0      0      0  24169      0      0     10     NA
>
> 11      0      0      0      0    438      0      0      3     NA
>
> 12      0      0      0      0   2158      0      0      1     NA
>
> 13      0      0      0      0  18804      0      0      4     NA
>
>>
>
> If I reduce the data frame datastep4 by removing a few records where
> the variable G102 is not 1, and removing the column named "G102"
> (which is column 84),
>
> I end up with a smaller set called datastep5 with 192701 rows and 94
> columns
>
>> datastep5 <- datastep4[datastep4$G102 != 1,-84]
>
This code does the _opposite_ of what you stated: it keeps only those
records where G102 is not equal to 1. (And if that is not an integer
type column, the result of the comparison could be even less predictable.)
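
A small sketch of the row filter may help here. Everything in it is hypothetical (the data frame `toy` and its values); it assumes, as a guess only, that G102 contains NA values. That guess is consistent with the counts quoted in this thread (141096 + 584 = 141680, fewer than the 192701 rows reported by dim), but the poster never confirmed it:

```r
# Hypothetical toy data standing in for datastep4; G102 contains an NA.
toy <- data.frame(RESPONDED = c(0, 1, 0, 1, 1),
                  G102      = c(1, 1, 2, NA, 3),
                  VAR_30    = c(5, 6, 7, 8, 9))

# `!= 1` keeps the rows where G102 is NOT 1. Comparing NA yields NA,
# and an NA row index produces a row filled with NAs.
not_one <- toy[toy$G102 != 1, -2]
print(not_one)

# `== 1` wrapped in which() keeps only rows where G102 is 1 and
# silently drops the NA comparison.
is_one <- toy[which(toy$G102 == 1), -2]
print(is_one)

# table() omits NA by default, so its counts can sum to less than nrow().
n_rows    <- nrow(not_one)                    # 3
n_counted <- sum(table(not_one$RESPONDED))    # 2

# An NA among negative subscripts is an error, so negating a 0/1 column
# that has picked up NAs can no longer be used as a column index.
res <- tryCatch(not_one[, -not_one$RESPONDED],
                error = function(e) "error")
print(res)                                    # "error"
```

If the real data do contain NAs in G102, that would explain both the row count and the failure of the negated index, independently of which element comes first.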
>>
>
>> dim(datastep5)
>
> [1] 192701     94
>
>> names(datastep5)[1:5]
>
> [1] "RESPONDED" "VAR_30"    "VAR_31"    "VAR_32"    "VAR_33"
>
>> table(datastep5$RESPONDED)
>
>      0      1
>
> 141096    584
>
>
> Now, if I want to reduce this data set by removing the RESPONDED
> column as was done for datastep4, it blows up
>
>> test <- datastep5[,-datastep5$RESPONDED]
I am guessing that the first element of datastep5$RESPONDED is now a  
zero. You are abusing the indexing conventions. Try instead either:

test <- datastep5[,-1]

Or if you want to imagine that you cannot remember the column number  
of "RESPONDED" then this will "work":

test <- datastep5[ , -which(names(datastep5)=="RESPONDED")]
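
For comparison, here is a sketch of two other ways to drop a column by name, using a hypothetical data frame `toy` in place of datastep5. It also shows one caveat of the -which() idiom when the name is not present:

```r
# Hypothetical toy data standing in for datastep5.
toy <- data.frame(RESPONDED = c(0, 1, 0),
                  VAR_30    = c(5, 6, 7),
                  VAR_31    = c(8, 9, 10))

# Keep every column whose name is not "RESPONDED".
by_name <- toy[, setdiff(names(toy), "RESPONDED")]

# Same idea with a logical mask over the column names.
by_mask <- toy[, names(toy) != "RESPONDED"]

print(names(by_name))   # "VAR_30" "VAR_31"
print(names(by_mask))   # "VAR_30" "VAR_31"

# Caveat: if the name is absent, which() returns integer(0), and
# -integer(0) is still integer(0), which selects zero columns
# instead of leaving the data frame unchanged.
none   <- toy[, -which(names(toy) == "NO_SUCH_COLUMN")]
n_none <- ncol(none)    # 0
```

The setdiff() and logical-mask forms leave the data frame intact when the name is missing, which may be the safer default.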
>
> Error in .subset(x, j) : only 0's may be mixed with negative subscripts
> [Merrick Bank confidentiality trailer elided]
David Winsemius, MD
Heritage Laboratories
West Hartford, CT
