[R-es] increasing memory size beyond 4 GB

An embedded and charset-unspecified text was scrubbed...
Name: not available
URL: <https://stat.ethz.ch/pipermail/r-help-es/attachments/20100318/d917a39e/attachment.pl>
Hi, how are you?

Do you know exactly where the error occurs? Is it when building the
model or when making the prediction?

If it is the former, how large is your training set?

Regards,

Carlos J. Gil Bellosta
http://www.datanalytics.com

On 18 March 2010 at 10:43, Víctor Rodríguez Galiano
<luxorvrg en hotmail.com> wrote:
Hello again,

This is my session information:

R version 2.10.1 (2009-12-14)
i386-pc-mingw32

locale:
[1] LC_COLLATE=Spanish_Spain.1252  LC_CTYPE=Spanish_Spain.1252
[3] LC_MONETARY=Spanish_Spain.1252 LC_NUMERIC=C
[5] LC_TIME=Spanish_Spain.1252

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

What I am trying to do is classify data stored in text files; each file is between 10 MB and 90 MB. In one text file (input) I list the files to be read for prediction, and in the output file I list the names of the output files that will hold the classifications. As I mentioned before, I have no problem with the randomForest classifier, but I do with the support vector machine. When I work with the training and test data there is no problem. The problem arises when I try to classify new examples.

I am also attaching the code I use for classification, in case it helps:

# R script for running a support vector machine (SVM) classification model and prediction for many segments/areas
# Calibration needs to run only once for the full model; prediction then runs in a loop over the segments/areas/regions
###################################################################################################################
# Part 1: calibration

library(e1071)

# calibration step (despite the .rf suffix in its name, calibrate.rf holds an SVM fit)
calibrate <- read.table("calibration.txt", header=TRUE)
calibrate$calibration <- as.factor(calibrate$calibration)
calibrate.rf <- svm(calibration~B1+B14+B15+B16+B17+B18+B19+B20+B21+B24+B25+B26+B51+B52+B53+B54+B55+B56+B57+B58+B59+B60+B61+B62, data=calibrate, cost=6.8, gamma=0.08)
####################################################################################################################

####################################################################################################################
# Part 2: Automated Prediction

# R automated prediction step for the support vector machine
# Note: first calibrate the model separately, then run this script for the different image segments/areas
# Note: this script requires two parameter files called input.txt and output.txt
# The first line of input.txt is the header, the second line is the number of input segments (e.g. bands and elevation values), and the remaining lines list the names of the input segment files (.txt)
# The first line of output.txt is the header, the second line is the number of output segments predicted by the classifier, and the remaining lines list the names of the output files (.txt)

# reading the parameter files
input <- read.table("input.txt", header=TRUE)
output <- read.table("output.txt", header=TRUE)

# no_elements1 and no_elements2 should be the same
no_elements1 <- as.integer(as.character(input$para1[1]))
# the count of output files lives in output$para2 (input has no para2 column)
no_elements2 <- as.integer(as.character(output$para2[1]))
stopifnot(no_elements1 == no_elements2)

# raising the memory limit to about 4 GB: memory.limit() takes a size in MB and is Windows-only
memory.limit(size=4000)

for (i in seq_len(no_elements1)) {
  input_name <- as.character(input$para1[i+1])
  # named newdata rather than predict, to avoid masking the predict() function
  newdata <- read.table(input_name, header=TRUE)
  predValues <- predict(calibrate.rf, newdata)
  # note: as.numeric() on a factor returns the internal level codes, not the original class labels
  predValues <- as.numeric(predValues)
  output_name <- as.character(output$para2[i+1])
  write.table(predValues, output_name, row.names=FALSE, col.names=output_name)
}
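
Since each input file can reach 90 MB, one way to bound the memory used inside the loop is to read and process each file a block of rows at a time instead of loading it whole. The following is a minimal sketch under stated assumptions, not part of the original script: the helper name `process_file_in_blocks`, its `block_rows` parameter, and the demonstration file are all hypothetical, and the input is assumed to be whitespace-delimited with a single header line.

```r
# Hypothetical helper (not part of the original script): apply `fun` to
# successive blocks of rows read from a whitespace-delimited text file.
process_file_in_blocks <- function(path, fun, block_rows = 50000L) {
  con <- file(path, open = "r")
  on.exit(close(con))
  # The first line holds the column names, as with header=TRUE.
  header <- scan(text = readLines(con, n = 1L), what = character(), quiet = TRUE)
  results <- list()
  repeat {
    # Each call continues from where the previous one stopped.
    block <- tryCatch(
      read.table(con, header = FALSE, col.names = header, nrows = block_rows),
      error = function(e) NULL  # a failed read at end of file means "no more rows"
    )
    if (is.null(block) || nrow(block) == 0L) break
    results[[length(results) + 1L]] <- fun(block)
    if (nrow(block) < block_rows) break  # last, partial block
  }
  unlist(results, use.names = FALSE)
}

# Small demonstration on a temporary file with 25 rows.
demo_path <- tempfile(fileext = ".txt")
write.table(data.frame(B1 = 1:25, B14 = 25:1), demo_path,
            row.names = FALSE, quote = FALSE)
row_sums <- process_file_in_blocks(demo_path, function(b) b$B1 + b$B14,
                                   block_rows = 10L)
```

With the model from the script, the function passed as `fun` could be something like `function(b) predict(calibrate.rf, b)`, so that only one block of new data is held in memory at any time.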

The str() function shows the following:

str(output)
'data.frame':   2 obs. of  1 variable:
 $ para2: Factor w/ 2 levels "1","2H_map.txt": 1 2
str(input)
'data.frame':   2 obs. of  1 variable:
 $ para1: Factor w/ 2 levels "1","2H.txt": 1 2
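
The str() output shows that each parameter file is read as a single factor column whose first value is the file count and whose remaining values are file names. As a small sketch of an alternative, the column can be read as plain character strings, which avoids the factor round trip. The temporary file and the names `param_path`, `n_files` and `file_names` are illustrative assumptions, not taken from the thread.

```r
# Stand-in parameter file in the layout described in the thread:
# a header line, then the number of files, then the file names.
param_path <- tempfile(fileext = ".txt")
writeLines(c("para1", "1", "2H.txt"), param_path)

# stringsAsFactors = FALSE keeps the column as character instead of a factor,
# so no toString()/as.character() conversion is needed afterwards.
input <- read.table(param_path, header = TRUE, stringsAsFactors = FALSE)

n_files <- as.integer(input$para1[1])
file_names <- input$para1[-1]

# Fail early if the declared count and the listed names disagree.
stopifnot(length(file_names) == n_files)
```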

_______________________________________________
R-help-es mailing list
R-help-es en r-project.org
https://stat.ethz.ch/mailman/listinfo/r-help-es

An embedded and charset-unspecified text was scrubbed...
Name: not available
URL: <https://stat.ethz.ch/pipermail/r-help-es/attachments/20100318/a56fceb7/attachment.pl>
Hi, how are you?

That is odd, because prediction needs hardly any resources. Anyway,
you have two relatively simple options.

1) The first may well not work: inside your loop, force calls to the
garbage collector with gc().

2) The second should work no matter what: instead of predicting on one
very large data set, split it by hand into several small ones, predict
"chunk by chunk", and stack the results appropriately.

Regards,

Carlos J. Gil Bellosta
http://www.datanalytics.com
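
The two options in the message above can be sketched together as follows. This is only an illustration under assumptions: `predict_in_chunks` and `chunk_size` are hypothetical names, and a base-R `lm` fit on the built-in `mtcars` data stands in for the SVM so that the example runs without extra packages. With e1071, the model argument would be the fitted `svm` object.

```r
# Hypothetical helper (not from the thread): predict block by block so that
# the intermediate objects of only one block exist at a time.
predict_in_chunks <- function(model, newdata, chunk_size = 10000L) {
  n <- nrow(newdata)
  if (n == 0L) return(predict(model, newdata))
  starts <- seq(1L, n, by = chunk_size)
  pieces <- lapply(starts, function(s) {
    rows <- s:min(s + chunk_size - 1L, n)
    out <- predict(model, newdata[rows, , drop = FALSE])
    gc()  # option 1: ask the garbage collector to run between blocks
    out
  })
  # option 2: stack the per-block results; unlist() also combines factors
  unlist(pieces, use.names = FALSE)
}

# Stand-in example with a base-R model.
fit <- lm(mpg ~ wt + hp, data = mtcars)
chunked <- predict_in_chunks(fit, mtcars, chunk_size = 7L)
whole <- unname(predict(fit, mtcars))
```

Predicting block by block should give the same values as predicting on the whole data frame, since each row is predicted independently of the others.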

On 18 March 2010 at 11:22, Víctor Rodríguez Galiano
<luxorvrg en hotmail.com> wrote:
The error occurs when making the prediction.

A third option, following on from Carlos's suggestions, is to
subsample the training set and analyse how the results vary with the
sample size.
It is quite likely that you do not need all the information to obtain
a reliable prediction model.
This kind of situation is exactly what sampling is for.

Regards. Olivier
--  
____________________________________

Olivier G. Nuñez
Email: onunez en iberstat.es
Tel : +34 663 03 69 09
Web: http://www.iberstat.es

____________________________________
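
The subsampling idea in the message above could be explored along these lines. This is a sketch under loud assumptions: the data are simulated, a base-R logistic regression (`glm`) stands in for the SVM, and the names `accuracy_at`, `summarise_size` and `learning_curve` are hypothetical. It fits the model on random subsamples of increasing size and reports the mean and spread of the held-out accuracy.

```r
set.seed(1)

# Simulated two-class data standing in for the real calibration set.
n <- 3000
sim <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
sim$y <- factor(rbinom(n, 1, plogis(1.5 * sim$x1 - sim$x2)))

# Hold out a fixed test set; subsamples are drawn from the remaining pool.
test_rows <- sample(n, 500)
test <- sim[test_rows, ]
pool <- sim[-test_rows, ]

# Fit on a random subsample of the given size and return held-out accuracy.
accuracy_at <- function(size) {
  train <- pool[sample(nrow(pool), size), ]
  fit <- glm(y ~ x1 + x2, data = train, family = binomial)
  pred <- predict(fit, test, type = "response") > 0.5
  mean(pred == (test$y == "1"))
}

# Repeat the draw a few times per size to see how much the result varies.
summarise_size <- function(size, reps = 5) {
  acc <- replicate(reps, accuracy_at(size))
  c(size = size, mean_accuracy = mean(acc), sd_accuracy = sd(acc))
}

sizes <- c(100, 500, 2000)
learning_curve <- as.data.frame(t(sapply(sizes, summarise_size)))
print(learning_curve)
```

If the mean accuracy stops improving beyond some sample size, a smaller training set may be enough for a reliable model.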

Hmm... Maybe I misread: if the problem is "when making the
prediction", I take it the model is already built and that it is
built without problems. Whether the model was built with 1,000 or
100,000 observations, what difference does that make when predicting
on a new data set?

Regards,

Carlos J. Gil Bellosta
http://www.datanalytics.com

True, Carlos.
I read your message too hastily.
In any case, I find it hard to believe that the problem is not in
building the model.
Regards. Olivier
--  
____________________________________

Olivier G. Nuñez
Email: onunez en iberstat.es
Tel : +34 663 03 69 09
Web: http://www.iberstat.es

____________________________________

An embedded and charset-unspecified text was scrubbed...
Name: not available
URL: <https://stat.ethz.ch/pipermail/r-help-es/attachments/20100318/190c2231/attachment.pl>