Hello,
I'm looking to gauge interest in this functionality, and in
eventually working it into a package.
I don't actually use LAS data much, and only have one example file, so
I'm hoping that those who do (or who know others who would be
interested) can help. I have previously posted this to r-spatial-devel.
R's support for the "raw" type provides a very neat way of reading
data from binary files such as LAS files. These files have a header
containing metadata that specifies how to read the bulk data in the
file. Essentially, you can read large sets of records at once into a
raw matrix with readBin and then use the indexing tools to extract the
relevant bytes for each element of the record. This means that the
bulk read is fast, and processing is also vectorized, since each field
can be pulled from the raw matrix by indexing and passed back through readBin.
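As a minimal, self-contained sketch of the idea (using a made-up
12-byte record of one double and one integer, not the actual LAS point
format):

```r
## Write three 12-byte records (a double "x" then an integer "id"),
## slurp all the bytes back as a raw matrix with one column per record,
## then extract each field with a single vectorized readBin call.
tf <- tempfile()
con <- file(tf, "wb")
for (i in 1:3) {
  writeBin(as.double(i * 1.5), con, size = 8)
  writeBin(as.integer(i), con, size = 4)
}
close(con)

recbytes <- 12L
allbytes <- matrix(readBin(tf, "raw", n = 3L * recbytes), nrow = recbytes)

## rows 1-8 of every column hold the double, rows 9-12 the integer
x  <- readBin(as.vector(allbytes[1:8, ]),  "double",  n = 3, size = 8)
id <- readBin(as.vector(allbytes[9:12, ]), "integer", n = 3, size = 4)
x   ## 1.5 3.0 4.5
id  ## 1 2 3
unlink(tf)
```

The same pattern scales to LAS point records, where the header tells
you the record length and the byte offsets of each field.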
I've done this in a simplistic way in the attached functions:
o "publicHeaderDescription" - returns a list generated to hold
information about the header
o "readLAS" - the file-reading function
It works for a LAS 1.0 file that I have (see lasinfo summary below)
and returns a matrix of record values. Certainly it works fast and
efficiently enough for smallish files. I really don't have experience
with the likely range of sizes of these files.
The remaining work I see involves:
- write as a "chunked" read so large files can be processed in parts,
not as one big slurp
- return only desired record values, and / or subset of records
- extract all components of the records, and generalize for LAS 1.1, 1.2
- re-organize the header read and arrangement (what I've done was
easy enough for me to understand, but it's a bit of a mess)
(- no doubt many more general improvements I haven't thought of)
To use:
source("http://staff.acecrc.org.au/~mdsumner/las/readLAS.R")
## I cannot provide LAS data, this is just an illustration
f <- "lfile.las"
file.info(f)$size/1e6
## [1] 53.83393 (Mb)
system.time({
d <- readLAS(f)
})
# user system elapsed
# 1.01 0.21 1.21
dim(d)
# [1] 1922632 5
colnames(d)
#[1] "x" "y" "z" "gpstime" "intensity"
Optionally, the list of header components (some extracted as actual
data, some left in their "raw" form) can be returned instead of the
data, using "returnHeaderOnly = TRUE".
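For illustration, here is a sketch of the kind of header parsing
involved; the byte offsets below are my reading of the LAS 1.0 public
header layout (not necessarily what readLAS.R does), and the header is
faked in memory rather than read from a file:

```r
## Build a fake 227-byte LAS 1.0 public header in memory and pull a few
## fields back out with readBin. LAS offsets are 0-based; R's raw vector
## is 1-based, hence the +1 shift in the indices below.
hdr <- raw(227)
hdr[1:4]     <- charToRaw("LASF")                                      # file signature
hdr[25:26]   <- as.raw(c(1, 0))                                        # version major/minor
hdr[95:96]   <- writeBin(227L, raw(), size = 2, endian = "little")     # header size
hdr[108:111] <- writeBin(1922632L, raw(), size = 4, endian = "little") # point count

rawToChar(hdr[1:4])                                                         ## "LASF"
readBin(hdr[95:96], "integer", size = 2, signed = FALSE, endian = "little") ## 227
readBin(hdr[108:111], "integer", size = 4, endian = "little")               ## 1922632
```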
###
## example: simple plot with rgl
library(rgl)
scl <- function(x) (x - min(x))/diff(range(x))
plot3d(d[,1:3], col = topo.colors(256)[scl(d[,5]) * 255 + 1])  ## map into 1:256
I hope this is of interest.
Cheers, Mike.
lasinfo output on "lfile.las"
---------------------------------------------------------
Header Summary
---------------------------------------------------------
File Name: lfile.las
Version: 1.0
Source ID: 0
Reserved: 0
Project ID/GUID: '00000000-0000-0000-0000-000000000000'
System Identifier: ''
Generating Software: 'TerraScan'
File Creation Day/Year: 0/0
Header Size 227
Offset to Point Data 229
Number Var. Length Records 0
Point Data Format 1
Point Data Record Length 28
Number of Point Records 1922632
Number of Points by Return 1572923 324305 24659 740 5
Scale Factor X Y Z 0.01 0.01 0.01
## etc...
read function for LAS data
11 messages · Michael Sumner, Alex Mandel, Etienne Bellemare Racine +1 more
I think there are people who would use it. You might want to have a look at http://liblas.org/ (some of the same people who do gdal/ogr work). Wrapping this library might be a good approach. There are example files available too. Thanks, Alex
Thanks Alex, I will eventually post this to a broader audience. I've used liblas and lastools, but the aim here is for a pure R implementation that is built directly from the LAS specification without 3rd party tools. The R code already works quite well to extract x/y/z/time/intensity, it just needs some extra work to tidy up and generalize things and ensure that very big datasets can be read. Cheers, Mike.
_______________________________________________ R-sig-Geo mailing list R-sig-Geo at stat.math.ethz.ch https://stat.ethz.ch/mailman/listinfo/r-sig-geo
1 day later
This is interesting, I'll try your code on my lidar files in the next few days.
What might be of interest in using liblas is that it provides support for many LAS versions, and they plan to support versions to come (conditional on funding), so an R binding might be of interest here. They are also working on the integration of a spatial index which would allow easier handling of large files. I must say I don't know how hard writing a wrapper for R might be for that particular tool. The other issue I see here is that R loads the whole file into memory, so while small files are manageable, that might not be so easy with (standard) larger ones, don't you think? Did you try the R SAGA package? There is a module for loading LAS files, but again I don't know how it manages memory. I guess it could be possible to use something like the ff package to handle bigger files, but that's just off the top of my head. Etienne
Hello,
The other issue I see here is that R loads the whole file into memory, so while small files are manageable, that might not be so easy with (standard) larger ones, don't you think?
That was certainly true of the version of the code I posted, but
writing a more flexible version is not difficult, and in fact less
difficult than I expected. I've implemented arguments to "skip"
records and to read "nrows" at a time, so there are the beginnings of
a wrapper around the core read for building more flexibility.
(I was thinking of including subsetting of various kinds, which really
makes it more complicated; the appropriate level to handle that is in
a wrapper to this function.)
I've updated the R source on my site, and here's a new example. This
should be considered a rough working draft; the details can be hidden
in the final suite of functions. My chunk/rows handling is pretty
awkward, and may have bugs for particular record numbers.
Any testing you can provide would be greatly appreciated.
# new version with "skip" and "nrows" arguments
source("http://staff.acecrc.org.au/~mdsumner/las/readLAS.R")
f <- "lfile.las"
## get just the header
hd <- readLAS(f, returnHeaderOnly = TRUE)
numrows <- hd$`Number of point records`
## [1] 1922632
## read in chunks, and pass to DB or ff, or subset by sampling, etc..
rowskip <- 0
chunk <- 1e5
rowsleft <- numrows
system.time({
## keep track of how many rows we skip, and how many are left
for (i in 1:ceiling(numrows / chunk)) {
if (rowsleft < chunk) chunk <- rowsleft
if (chunk < 1) break
d <- readLAS(f, skip = rowskip, nrows = chunk)
rowskip <- rowskip + chunk
rowsleft <- rowsleft - chunk
}
})
# user system elapsed
# 1.10 0.55 1.64
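The skip/nrows bookkeeping can be dry-run without any LAS file by
swapping the readLAS call for an accumulator, which makes it easy to
check that every record is visited exactly once (record count taken
from the example above):

```r
## Dry-run of the chunk bookkeeping: record each chunk size instead of
## reading, then confirm the sizes cover all records with no overlap.
numrows <- 1922632
chunk <- 1e5
sizes <- integer(0)
rowskip <- 0
rowsleft <- numrows
for (i in 1:ceiling(numrows / chunk)) {
  if (rowsleft < chunk) chunk <- rowsleft
  if (chunk < 1) break
  ## d <- readLAS(f, skip = rowskip, nrows = chunk) would go here
  sizes <- c(sizes, chunk)
  rowskip <- rowskip + chunk
  rowsleft <- rowsleft - chunk
}
sum(sizes) == numrows  ## TRUE: no records missed or read twice
```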
1 day later
On Jun 6, 2010, at 5:09 AM, Etienne Bellemare Racine wrote:
[...] They are also working on the integration of a spatial index which would allow easier handling of large files.
...and controllable point caching, reprojection on the fly (when linked with GDAL), coordinate system description, and LAS 1.0-1.3 support (no waveform for 1.3, though). I would note that, contrary to what the LAS specification says, there is a lot of software out in the wild that doesn't write LAS files properly. This causes big hassles, but it's libLAS's job to ensure that it can read everything, so we bend over backwards to make things work while doing our best to write valid files. I maintain an LAS sample library at http://liblas.org/samples . The samples vary by format type and by the software that wrote them, as this is what is most interesting to libLAS. They are free for non-commercial use. I understand the desire to avoid external dependencies, which is the impetus for efforts like this. It is unfortunate that distribution and platform issues make duplication of efforts like these economical in the short term. Howard
Thanks Etienne. On Tue, Jun 8, 2010 at 12:31 AM, Etienne Bellemare Racine
<etiennebr at gmail.com> wrote:
Michael,
I have not gone through extensive testing, but it seems pretty fast and
useful for my LAS 1.0 files. I've loaded a 235 Mb file on a USB drive, and it ran
in ~20 seconds. SAGA, in comparison, did it in 3 min 22 seconds.
Cheers,
Etienne
Thanks Howard, do you know of people working with LAS data in R, or of any efforts to link to liblas? I'd be keen to get involved eventually, but it's not an area where I have particular strengths. I've noticed the "R" data format driver in GDAL (for reading/writing simple arrays in R's save() format), so there must be some in that community using R in a fairly intimate way. BTW, I don't want this example to be seen as some kind of snub to the efforts of projects like liblas; I'm really interested in the fact that it *can* be done, not necessarily pushing that it should be done. The impetus here was really my interest in alternative methods, not just avoiding dependencies. Tools like this can be helpful for basic data rescue when a full solution is not at hand. There are endless varieties of raw binary formats out there, and very few have a committed team like liblas or GDAL working to help. Cheers, Mike.
On Jun 7, 2010, at 7:41 PM, Michael Sumner wrote:
BTW, I don't want this example to be seen as some kind of snub to the efforts of projects like liblas, I'm really interested in the fact that it *can* be done - not necessarily pushing that it should be done. The impetus here was really my interest in alternative methods, not just avoiding dependencies.
I didn't take it as a snub, but rather I was firing back with a warning at how deep this LAS rabbit hole actually goes :) After first reading the spec, I thought, "oh a nice simple binary format, I can write code for that." It seems everyone else thought the same as well, and we all came back with different answers. Over time, I hope the bigger vendors pick up libLAS internally (one or two large commercial vendors and a number of mid-level ones now use libLAS), so the variations in data start to go away. I've given them every possible reason to take it up too -- BSD license, loads of features, multiple APIs, multiple platforms, not slow.
Thanks Howard, I appreciate the perspective.
"oh a nice simple binary format, I can write code for that."
Heh - yes indeed this is an attractive pit. Cheers, Mike.