clustering multi band images

Dear list,
I am trying to do some clustering on images, and I have two main problems:

1) Clustering multiband images.
I managed to cluster a single-band image, but when I try the same with a
3-band image I get the following warning message:
In as.matrix.SpatialGridDataFrame(x) :
  as.matrix.SpatialPixelsDataFrame uses first column;
  pass subset or [] for other columns

2) Saving clustering results as a grid or image.
I get a vector of clusters, but without the coordinates. How can I
transform it into a grid?

Here is the code I use to read the image and do the clustering:

library(rgdal)
fld <- system.file("E:/data/IMG/fr/", package="rgdal")
img <- readGDAL("123_rawR.tif")

kl <- kmeans(img, 5)

I am quite new to image processing, especially within R, and any help is
greatly appreciated.

Thank you in advance

LP

Roger Bivand wrote:

img is a SpatialGridDataFrame. kmeans() wants a matrix or data frame, so
say:

kl <- kmeans(as(img, "data.frame"), 5)

Then

img$cluster <- kl$cluster

image(img, "cluster")

Does this help?

Roger

_______________________________________________
R-sig-Geo mailing list
R-sig-Geo at stat.math.ethz.ch
https://stat.ethz.ch/mailman/listinfo/r-sig-geo

Roger Bivand
Economic Geography Section, Department of Economics, Norwegian School of
Economics and Business Administration, Helleveien 30, N-5045 Bergen,
Norway. voice: +47 55 95 93 55; fax +47 55 95 95 43
e-mail: Roger.Bivand at nhh.no
However, the results are quite different from what I expected.
So I am wondering whether this is the correct way to handle clustering of
remote sensing images.

Thanks

LP

Laura, it's hard for us to look over your shoulder and say something
useful; we also don't know your expectations.

Please remember that clustering is done only on the pixel values, and 
ignores the spatial ordering of those pixels. If there's a bit of noise 
in the image, you might find quite a bit of noise in the resulting clusters.
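
A minimal sketch of this point, using synthetic pixel values rather than a real image: shuffling the pixels before clustering gives every pixel the same cluster as before. The data, the fixed starting centres and the choice of algorithm = "Lloyd" (used here only so that the two runs are directly comparable) are assumptions made for illustration.

```r
set.seed(42)

# Synthetic "image": 400 pixels x 3 bands, drawn from three well-separated groups
n <- 400
group <- sample(1:3, n, replace = TRUE)
pix <- matrix(rnorm(n * 3, mean = rep(c(10, 50, 90)[group], 3), sd = 2), ncol = 3)

# Fixed starting centres so that both runs start from the same place
start <- rbind(c(0, 0, 0), c(50, 50, 50), c(100, 100, 100))

kl <- kmeans(pix, centers = start, algorithm = "Lloyd")

# Shuffle the pixel order (destroying any spatial arrangement) and cluster again
perm <- sample(n)
kl_shuffled <- kmeans(pix[perm, ], centers = start, algorithm = "Lloyd")

# Each pixel gets the same cluster wherever it sits in the table
all(kl_shuffled$cluster == kl$cluster[perm])
```

Because only the band values enter the distance computation, neighbouring pixels influence each other only if their values happen to be similar.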
--
Edzer
If your images are large (and images typically are large, because pixel size
has to be small compared to the extent of the image for the image to
be of acceptable quality for our vision system), I do not advise you
to load them into R for processing, as R has severe memory limits
and many classification techniques are not exactly memory-efficient
(but see clara() in package cluster, and read
http://cran.r-project.org/web/views/Cluster.html).
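
As a rough illustration of the clara() suggestion, the sketch below clusters a synthetic pixels x bands table. The table size, the number of clusters and the samples/sampsize values are arbitrary assumptions, not recommendations.

```r
library(cluster)  # recommended package, normally installed alongside R

set.seed(1)

# Hypothetical pixels x bands table: 50000 pixels, 3 bands
pix <- matrix(runif(50000 * 3, 0, 255), ncol = 3)

# clara() searches for medoids on repeated sub-samples of the rows,
# then assigns every row to its nearest medoid
cl <- clara(pix, k = 5, samples = 10, sampsize = 500)

table(cl$clustering)  # one cluster label per pixel
cl$medoids            # 5 x 3 matrix of representative pixels
```

Only the sub-samples go through the expensive pairwise-distance step, which is what keeps the memory use modest.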

I think that you should sample your image in an RS/GIS environment, making
sure you cover the whole radiometric space, and import only a
pixels x bands table into R, the actual number of pixels depending on your
HW/SW configuration (10000 would be a good start). Then use the numerous
R classification tools to define the centroids, and once you have them use
your RS/GIS program again to assign each pixel in the image to a centroid
according to a given rule (e.g. maximum likelihood). There might also be
ways of writing an efficient assignment step within R itself; I
think that the mclust package does it.
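
The sample-then-assign idea can be sketched entirely in R for an image that fits in memory. This is only an illustration: the image is synthetic, kmeans() stands in for whichever classification tool is chosen, and the assignment uses a plain nearest-centroid (minimum distance) rule instead of maximum likelihood.

```r
set.seed(7)

# Hypothetical 512 x 512 image with 3 bands, already as a pixels x bands table
n_pix <- 512 * 512
pix <- matrix(runif(n_pix * 3, 0, 255), ncol = 3,
              dimnames = list(NULL, c("b1", "b2", "b3")))

# 1. Sample 10000 pixels and define the centroids on the sample only
idx <- sample(n_pix, 10000)
fit <- kmeans(pix[idx, ], centers = 5, nstart = 5, iter.max = 50)

# 2. Squared Euclidean distance from every pixel to every centroid
d2 <- sapply(seq_len(nrow(fit$centers)), function(k) {
  rowSums(sweep(pix, 2, fit$centers[k, ])^2)
})

# 3. Each pixel takes the label of its nearest centroid
cluster_all <- max.col(-d2, ties.method = "first")
table(cluster_all)
```

A random sample is used here for brevity; covering the whole radiometric space, as advised above, may require a stratified sample when some classes are rare.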

Another way of reducing the number of individuals to classify is to
segment the image first and then classify segments
instead of pixels (e.g.
Lobo, A. 1997. Image segmentation and discriminant analysis for the
identification of land cover units in Ecology. IEEE Transactions on
Geoscience and Remote Sensing, 35(5): 1-11.
http://wija.ija.csic.es/gt/obster/ABSTRACTS/alobo_ieee97.pdf
perhaps other articles in
http://wija.ija.csic.es/gt/obster/alobo_publis.html
might be of help).
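
A small sketch of classifying segments instead of pixels. It assumes a segment id is already available for every pixel; here a regular tiling into 8 x 8 blocks stands in for a real segmentation, which it is not, and kmeans() on the per-segment band means is just one possible classifier.

```r
set.seed(5)

# Hypothetical 64 x 64 image with 3 bands as a pixels x bands table
n_row <- 64; n_col <- 64
pix <- matrix(runif(n_row * n_col * 3, 0, 255), ncol = 3)

# Stand-in segmentation: 8 x 8 pixel blocks, numbered 1..64
r  <- rep(seq_len(n_row), times = n_col)  # row index of each pixel
cc <- rep(seq_len(n_col), each = n_row)   # column index of each pixel
seg <- (ceiling(r / 8) - 1) * (n_col / 8) + ceiling(cc / 8)

# One row per segment: mean of each band within the segment
seg_mean <- rowsum(pix, group = seg) / as.vector(table(seg))

# Classify the 64 segments instead of the 4096 pixels
kl <- kmeans(seg_mean, centers = 4)

# Every pixel inherits the class of its segment
pix_class <- unname(kl$cluster[match(seg, rownames(seg_mean))])
```

The table passed to the classifier shrinks from one row per pixel to one row per segment, which is where the memory saving comes from.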

In any case, note that img in your code should be converted into
a multivariate pixels x bands table for most classification
functions in R to work. This makes it obvious
that classification approaches to image processing do not make
use of the spatial information of the image, which is actually
a fundamental part of the information in any image.
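
For illustration, here is how a rows x cols x bands array can be flattened into such a table and how the resulting cluster vector folds back into a grid. The array is synthetic and its dimensions are arbitrary assumptions; with a real image the array would come from whatever reader is used.

```r
set.seed(3)

# Hypothetical image stored as a rows x cols x bands array
n_row <- 40; n_col <- 60; n_band <- 3
arr <- array(runif(n_row * n_col * n_band), dim = c(n_row, n_col, n_band))

# Flatten to a pixels x bands table: one row per pixel, one column per band
pix <- matrix(arr, nrow = n_row * n_col, ncol = n_band)

kl <- kmeans(pix, centers = 4)

# The cluster vector follows the pixel order, so it folds back into the grid
cluster_grid <- matrix(kl$cluster, nrow = n_row, ncol = n_col)

# Transpose and flip so that image() draws row 1 at the top
image(t(cluster_grid)[, n_row:1], col = rainbow(4), axes = FALSE)
```

The coordinates are never needed by the classifier; they are implied by each pixel's position in the table.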

Agus

Dr. Agustin Lobo
Institut de Ciencies de la Terra "Jaume Almera" (CSIC)
LLuis Sole Sabaris s/n
08028 Barcelona
Spain
Tel. 34 934095410
Fax. 34 934110012
email: Agustin.Lobo at ija.csic.es
http://www.ija.csic.es/gt/obster
Laura,

Laura Poggio wrote:
> Thank you very much for your detailed answer that made me understand a
> lot, and also it pointed out what I was thinking: R does not use the
> spatial information for classification.

Hey! This is not a problem with R, so don't blame it for that. R is wonderful
for multivariate classification. This is a problem of the whole
approach of applying multivariate classification to multispectral
imagery. That does not mean that the approach is wrong or useless;
it is just a caveat that the analyst must keep in mind.

Agus

> The image (for the moment) is rather small, as it is a sample of 512x512
> pixels. I have to compare the effect of a segmentation method over raw
> data for various unsupervised techniques.
> My idea was to do the classification in R, because it handles many more
> methods than the GIS/RS software I have available.
>
> I will investigate some of the points raised and, if needed, I will come
> back with clearer ideas and questions.
>
> Thank you very much to everybody for the support.
>
> Laura

(sorry I pressed the send button instead of the save as draft button,
I go on with my comments)

> The image (for the moment) is rather small, as it is a sample of 512x512
> pixels.

As you have 3 bands, the table holds 512x512x3 values, which might
be OK, depending on the RAM you have. 512x512 is rather small for
imagery these days (unless you had hyperspectral images!).
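
To put a number on that, a quick back-of-the-envelope check, assuming the values are held as double precision numbers (8 bytes each), which is how R stores a numeric matrix:

```r
# 512 x 512 pixels, 3 bands, 8 bytes per double-precision value
n_values <- 512 * 512 * 3
bytes <- n_values * 8
bytes / 1024^2   # size in MiB: 6

# The same figure measured on an actual pixels x bands matrix
pix <- matrix(0, nrow = 512 * 512, ncol = 3)
print(object.size(pix), units = "Mb")
```

The raw table is therefore about 6 MiB. Clustering functions usually need working copies and distance structures on top of that, so the real footprint is larger.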

You should take advantage of the relatively small size of your image to
compare results using an increasing number of sampled pixels. If you
use model-based clustering, I would say that 10000 pixels
(covering the whole radiometric space; this is an important caution)
would yield the same results as using all the 512x512 pixels.
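
Such a comparison could be sketched as follows. Note the assumptions: the image is synthetic (three overlapping spectral groups), kmeans() with fixed starting centres is used instead of model-based clustering to keep the example short and the labels comparable, and assign_all() is a hypothetical helper written for this example.

```r
set.seed(11)

# Hypothetical 512 x 512 x 3 image as a pixels x bands table
n_pix <- 512 * 512
grp <- sample(1:3, n_pix, replace = TRUE)
pix <- matrix(rnorm(n_pix * 3, mean = rep(c(40, 120, 200)[grp], 3), sd = 25),
              ncol = 3)

# Fixed starting centres keep cluster labels comparable between runs
start <- rbind(c(0, 0, 0), c(128, 128, 128), c(255, 255, 255))

# Hypothetical helper: label every pixel by its nearest centre
assign_all <- function(centers) {
  d2 <- sapply(seq_len(nrow(centers)), function(k) {
    rowSums(sweep(pix, 2, centers[k, ])^2)
  })
  max.col(-d2, ties.method = "first")
}

# Reference: centres estimated from all pixels
full <- assign_all(kmeans(pix, centers = start, iter.max = 50)$centers)

# Centres estimated from samples of increasing size
sizes <- c(100, 1000, 10000)
agreement <- sapply(sizes, function(n) {
  fit_n <- kmeans(pix[sample(n_pix, n), ], centers = start, iter.max = 50)
  mean(assign_all(fit_n$centers) == full)
})
data.frame(sampled_pixels = sizes, agreement = agreement)
```

The agreement column gives the share of pixels labelled identically to the full-image run; on a real image the interesting question is at which sample size it stops improving.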

 > I have to compare the effect of a segmentation method over raw
 > data for various unsupervised techniques.

Segmentation is not only meant for reducing memory problems; that
is just a fortunate side effect. Segmentation has many other advantages
(and some disadvantages).

> My idea was to do the classification in R, because it handles many more
> methods than the GIS/RS software I have available.

And I agree with you. By using R you are free of the many
constraints of the classification methods implemented in RS
packages, and you can experiment with many more methods. And you
know exactly what you are doing.
I mentioned the use of RS/GIS for sampling and assigning in case you had
large images (yours are exceptionally small nowadays).

Good luck!

Agus
On Thu, 12 Jun 2008, Laura Poggio wrote:
> kl <- kmeans(img, 5)

Roger Bivand wrote:
> img is a SpatialGridDataFrame. kmeans() wants a matrix or data frame,
> so say:
>
> kl <- kmeans(as(img, "data.frame"), 5)

This also passes the coordinates to the clustering routine; I'm not sure,
but I think that was not the original intention.
If img is a 3-band image, use

kl <- kmeans(as(img, "data.frame")[1:3], 5)
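
Put together, the corrected workflow might look like the sketch below. It assumes the sp package is installed and builds a small synthetic SpatialGridDataFrame in place of the readGDAL() result; the band names and grid size are made up for the example.

```r
library(sp)

set.seed(1)

# Hypothetical 20 x 20 grid with three bands, standing in for readGDAL() output
grd <- GridTopology(cellcentre.offset = c(0.5, 0.5),
                    cellsize = c(1, 1),
                    cells.dim = c(20, 20))
bands <- data.frame(band1 = runif(400), band2 = runif(400), band3 = runif(400))
img <- SpatialGridDataFrame(grd, bands)

# The data frame holds the bands first, followed by the coordinate columns
df <- as(img, "data.frame")
names(df)

# Cluster on the three band columns only, leaving the coordinates out
kl <- kmeans(df[1:3], centers = 5)

# Store the labels back on the grid and plot them as an image
img$cluster <- kl$cluster
image(img, "cluster")
```

Once the labels are stored as a column of img, the result is again a grid with coordinates, which covers the second question of the original post.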
--
Edzer

Roger Bivand wrote:
> kl <- kmeans(as(img, "data.frame"), 5)

Edzer wrote:
> this also passes the coordinates to the clustering routine; I'm not sure but
> I think that was not the initial idea.
> In case img is a 3-band image, use
>
> kl <- kmeans(as(img, "data.frame")[1:3], 5)

Right, I replied too fast without checking, sorry!

Roger