Message-ID: <f75f5dc7-06ae-65ae-b02c-2eea745e101c@gmail.com>
Date: 2017-07-10T16:03:04Z
From: Duncan Murdoch
Subject: [New Patch] Fix disk corruption when writing
In-Reply-To: <92cd247a-52d7-d862-f5aa-509a62cc46a1@insa-toulouse.fr>

On 10/07/2017 9:00 AM, Serguei Sokol wrote:
> On 10/07/2017 at 13:13, Duncan Murdoch wrote:
>> On 10/07/2017 5:34 AM, Serguei Sokol wrote:
>>> On 10/07/2017 at 11:19, Duncan Murdoch wrote:
>>>> On 10/07/2017 4:54 AM, Serguei Sokol wrote:
>>>>> On 08/07/2017 at 00:54, Duncan Murdoch wrote:
>>>>>> I have now committed changes to R-devel (rev 72898) that seem to catch large and small errors.  They only give a warning if the error happens when the
>>>>>> connection is closed, because that can happen asynchronously.
>>>>> For this asynchronous behavior, wouldn't it be more useful to have
>>>>> the name of the file that failed at closing? If many files were open
>>>>> during a session and not closed explicitly (yes, bad practice but it
>>>>> can happen), the warning message doesn't help to understand
>>>>> which of the files were corrupted, e.g.:
>>>>>  > fc=file("/dev/full", "w")
>>>>>  > write.csv("a", file=fc)
>>>>>  > q("yes")
>>>>> Warning message:
>>>>> In close.connection(getConnection(set[i])) :
>>>>>    Problem closing connection:  No space left on device
>>>>>
>>>>> Having only "set[i]" for indication is not very informative, is it?
>>>>
>>>> To debug your failure to close fc, reproduce the conditions before the warning was issued, and call showConnections().
>>> It can help in some cases but not in all.
>>> First, reproducing the exact conditions of the failure is not always possible. It could
>>> happen after a long calculation, and the environment that caused
>>> the failure could have changed in the meantime. Second, having the list of
>>> connections still does not say which one (or which ones) failed, as
>>> we have only "set[i]", not even the connection number (which in turn
>>> might differ between the first failure and an attempt to reproduce it).
>>>
>>> Is adding con->description to the warning message problematic in any sense?
>>
>> Yes, we don't know if it is still valid after the connection has been closed.  It's just a pointer, whose target is allocated when the connection is created,
>> and deallocated when it is closed. Using it after closing could lead to a seg fault.
> If you mean "free(con->description);" which is in con_close1() at connections.c:3536,
> it occurs after calling checkClose(). Then logically, con->description is still valid
> while the warning message is generated.

No, because we don't know what happened in the con->close() function. 
It may have set the description to NULL.  Or it may have been NULL from 
the beginning.
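
For illustration only, here is a minimal sketch of how a diagnostic could
carry the description without trusting the pointer after close: take a copy
before calling the close function, and tolerate NULL. The struct demo_conn
and the helpers snapshot_description(), close_with_diagnostic() and
failing_close() are hypothetical names invented for this sketch. They are
not R's actual connection code in connections.c.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a connection object; NOT R's real struct. */
typedef struct demo_conn {
    char *description;                 /* may be NULL at any time */
    int (*close)(struct demo_conn *);  /* returns 0 on success */
} demo_conn;

/* Copy the description before close() runs, substituting a
 * placeholder when it is NULL. Caller frees the result. */
static char *snapshot_description(const demo_conn *con)
{
    const char *src = (con && con->description) ? con->description
                                                : "<unknown>";
    char *copy = malloc(strlen(src) + 1);
    if (copy)
        strcpy(copy, src);
    return copy;  /* NULL only if malloc failed */
}

/* Close the connection; on failure write a message naming the
 * connection into buf. Only the private copy is read after close(),
 * so it does not matter what close() did to con->description. */
static int close_with_diagnostic(demo_conn *con, char *buf, size_t buflen)
{
    assert(con != NULL && buf != NULL && buflen > 0);
    char *name = snapshot_description(con);
    int status = con->close(con);
    buf[0] = '\0';
    if (status != 0)
        snprintf(buf, buflen, "Problem closing connection '%s'",
                 name ? name : "<unknown>");
    free(name);
    return status;
}

/* Example close function that invalidates the description and
 * then reports failure, the case the original pointer cannot survive. */
static int failing_close(demo_conn *con)
{
    free(con->description);
    con->description = NULL;
    return -1;
}
```

The cost of this design is one extra allocation per close. Whether that is
acceptable in the real code path is a separate question.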

Obviously this obscure aspect of poor programming behaviour could get a
better diagnostic message, but it's not worth any more of my time.
I'd rather spend time on things that actually matter.

Duncan Murdoch