Discussion:
close fd while select/poll/epoll
shebble
2008-02-03 05:31:28 UTC
On Linux, let's say a 2.6 kernel...

Can you close a file descriptor in one thread while another thread is using it
in a select(), poll(), or epoll_wait()?

What will happen?

thx
p***@ipal.net
2008-02-03 16:29:26 UTC
On Sat, 02 Feb 2008 22:31:28 -0700 shebble <***@example.com> wrote:
| On Linux, lets say 2.6 kernel...
|
| Can you close a file descriptor in one thread while another thread is using it
| in a select(), poll(), or epoll_wait() ?
|
| What will happen?

Interesting question. And what if ALL the descriptors being polled are
closed this way? It would seem the poll() is then waiting on nothing
at all, and either has to wake up immediately or has to wait until the
timeout (which could be forever).
--
|---------------------------------------/----------------------------------|
| Phil Howard KA9WGN (ka9wgn.ham.org) / Do not send to the address below |
| first name lower case at ipal.net / spamtrap-2008-02-03-***@ipal.net |
|------------------------------------/-------------------------------------|
shebble
2008-02-03 21:30:12 UTC
Post by p***@ipal.net
| On Linux, lets say 2.6 kernel...
|
| Can you close a file descriptor in one thread while another thread is using it
| in a select(), poll(), or epoll_wait() ?
|
| What will happen?
Interesting question. And what if ALL the descriptors being polled are
closed this way. It would seem either the poll() will now wait on nothing
that is there to wait on, and either has to wake up immediately, or has
to wait until the timeout (which could be forever).
I tried it with epoll :) I tried closing 1 of 2 descriptors, and 2 of 2
descriptors I was waiting on. epoll_wait() does nothing; it continues to
wait. A few times under a debugger I got it to return with EINTR, but I'm
guessing that was something to do with the debugger.

And anyway, EINTR wouldn't do you much good, eh? How would you know that a
file descriptor was closed, or which one it was?
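A minimal sketch of the experiment described here, assuming a pipe as the
never-readable descriptor (the pipe, the 5 second timeout, and all names
are illustrative, not the poster's actual code):

#include <stdio.h>
#include <unistd.h>
#include <pthread.h>
#include <sys/epoll.h>

static int fds[2];

static void *closer(void *arg)
{
    sleep(1);        /* let the main thread enter epoll_wait() first */
    close(fds[0]);   /* close the only descriptor being waited on */
    return NULL;
}

int main(void)
{
    struct epoll_event ev = { .events = EPOLLIN };
    pthread_t t;
    int ep = epoll_create(1);

    pipe(fds);       /* read end will never become readable */
    ev.data.fd = fds[0];
    epoll_ctl(ep, EPOLL_CTL_ADD, fds[0], &ev);

    pthread_create(&t, NULL, closer, NULL);

    /* Per the report above, the wait is not woken by the close; with a
       5 second timeout this should return 0 (timeout), not an event. */
    int n = epoll_wait(ep, &ev, 1, 5000);
    printf("epoll_wait returned %d\n", n);
    pthread_join(t, NULL);
    return 0;
}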
p***@ipal.net
2008-02-04 13:45:02 UTC
On Sun, 03 Feb 2008 14:30:12 -0700 shebble <***@example.com> wrote:
| phil-news-***@ipal.net wrote:
|> On Sat, 02 Feb 2008 22:31:28 -0700 shebble <***@example.com> wrote:
|> | On Linux, lets say 2.6 kernel...
|> |
|> | Can you close a file descriptor in one thread while another thread is using it
|> | in a select(), poll(), or epoll_wait() ?
|> |
|> | What will happen?
|>
|> Interesting question. And what if ALL the descriptors being polled are
|> closed this way. It would seem either the poll() will now wait on nothing
|> that is there to wait on, and either has to wake up immediately, or has
|> to wait until the timeout (which could be forever).
|>
|
| I tried it with epoll :) I tried closing 1 of 2 descriptors, and 2 of 2
| descriptors I was waiting on. epoll_wait() does nothing. continues to wait. A
| few times under debugger I got it to return with EINTR, but im guessing that was
| something to do with the debugger.
|
| And anyways EINTR wouldnt do you much good eh? How would you know that a file
| descriptor closed and which one it was.

With threads running, which thread is supposed to get the signal and thus
cause a syscall to wake up with EINTR? As far as I know, one cannot depend
on that. The timeout value in the syscall would be the only thing I know
to depend on. Did you try setting it to a time that you could wait out, to
see if it still wakes as expected even though all its descriptors are closed?
I would think it still would, but maybe it's good to be sure.

Now, whether select/poll/epoll_wait should wake up with an error on the
closed descriptor when it gets closed, that should be determined by the
standard first. If it specifies, then Linux should do exactly that. If
it does not specify, I'd prefer to see it wake up the waiting thread and
indicate that the descriptor has an error condition. But that may be hard to
do if the needed internal data structures to pass the condition are now
gone because of the close.
David Schwartz
2008-02-04 20:34:41 UTC
Post by p***@ipal.net
Now, whether select/poll/epoll_wait should wake up with an error on the
closed descriptor when it gets closed, that should be determined by the
standard first. If it specifies, then Linux should do exactly that.
The standard could not possibly specify what should happen in a
situation that can't be reliably created. The 'as-if' rule would
overrule anything it said. It would be like the standard requiring
'recvfrom' to generate a particular error if a UDP datagram were
dropped.
Post by p***@ipal.net
If
it does not specify, I'd prefer to see it wake up the waiting thread and
indicate that descriptor has an error condition.
Impossible, literally.
Post by p***@ipal.net
But that may be hard to
do if the needed internal data structures to pass the condition are now
gone because of the close.
Please present a complete, compilable POSIX-compliant example that is
guaranteed to close a file descriptor in one thread while another
thread is blocked on 'select' or 'poll'.

You are talking about what should/must happen in a situation that is
impossible to create!

DS
p***@ipal.net
2008-02-05 03:20:16 UTC
On Mon, 4 Feb 2008 12:34:41 -0800 (PST) David Schwartz <***@webmaster.com> wrote:
| On Feb 4, 5:45 am, phil-news-***@ipal.net wrote:
|
|> Now, whether select/poll/epoll_wait should wake up with an error on the
|> closed descriptor when it gets closed, that should be determined by the
|> standard first. If it specifies, then Linux should do exactly that.
|
| The standard could not possibly specify what should happen in a
| situation that can't be reliably created. The 'as-if' rule would
| overrule anything it said. It would be like the standard requiring
| 'recvfrom' to generate a particular error if a UDP datagram were
| dropped.

The analogy to UDP datagrams being dropped is not the same thing. The
datagram could be dropped entirely asynchronously. It is also defined
as something the sender isn't supposed to be synchronously informed of.

Defining that closing a descriptor creates an error state for any syscall
waiting on that descriptor, and hence wakes up the syscall if it is such
that an error would wake it up, is a synchronous definition. The kernel
has to have a way to notify the waiting process for other kinds of error
states. This (the descriptor becoming closed) is just another one of
those errors.


|> If
|> it does not specify, I'd prefer to see it wake up the waiting thread and
|> indicate that descriptor has an error condition.
|
| Impossible, literally.

Is it also impossible to wake up the waiting thread if the peer of a TCP
connection closes the connection? Of course not. How is closing the
descriptor any different? Sure, it means some more work for the code
doing the closing.


|> But that may be hard to
|> do if the needed internal data structures to pass the condition are now
|> gone because of the close.
|
| Please present a complete, compilable POSIX-compliant example that is
| guaranteed to close a file descriptor in one thread while another
| thread is blocked on 'select' or 'poll'.

You of all people should have no difficulty in creating such a scenario.
But it doesn't need to be perfect; if it fails 1 time out of 1000000
because of timing, that does not matter. The test scenario can be re-run.
It can be made to work _most_ of the time, and that is good enough. It
would set up a network connection somewhere (maybe to itself) that would
have nothing being sent. Then create a 2nd thread that waits a specified
period of time (seconds or even minutes). The first thread calls poll()
or select(), depending on which is being tested, with a much longer timeout
in the syscall than the 2nd thread will be waiting for. It then waits for
a readable status on the descriptor in question, as well as an error status.
When the 2nd thread wakes up, it merely closes that descriptor. It can
wait for a while or exit, depending on how the test scenario needs to proceed.


| You are talking about what should/must happen in a situation that is
| impossible to create!

Nope. Not at all. This isn't about a single-threaded process. This is
about multiple threads sharing the same set of descriptors. It is quite
doable.
David Schwartz
2008-02-06 00:12:52 UTC
Post by p***@ipal.net
You of all people should have no difficulty in creating such a scenario.
It is impossible.
Post by p***@ipal.net
But it doesn't need to be perfect in that if it fails 1 out of 1000000
because of timing does not matter. The test scenario can be re-run.
The problem is, the test program can never know whether it succeeded
or failed. It is literally impossible to set up a situation in which
the test program is guaranteed to succeed. As such, the failure
behavior is *always* legal.
Post by p***@ipal.net
It can be made to work _most_ of the time, and that is good enough. It
would set up a network connection somewhere (maybe itself) that would
have nothing being sent. Then create a 2nd thread that waits a specified
period of time (seconds or even minutes). The first thread calls poll()
or select(), depending on which is being tested, with a much longer timeout
in the syscall than the 2nd thread will be waiting for. It then waits for
a readable status on the descriptor in question, as well as an error status.
When the 2nd thread wakes up, it merely closes that descriptor. It can
wait for a while or end depending on how the test scenario needs to proceed.
The problem is, no timeout is guaranteed to be long enough. So no
timeout is guaranteed to provide the behavior. The standard cannot
make a guarantee in a situation where no guarantee is possible. It
literally cannot do so.
Post by p***@ipal.net
| You are talking about what should/must happen in a situation that is
| impossible to create!
Nope. Not at all. This isn't about a single threaded process. This is
about multiple threads sharing the same set of descriptors. It is quite
doable.
Do it. Provide a program that is 100% guaranteed by the standard to
produce the behavior where one thread closes a descriptor while
another thread is polling on it.

That is the only case in which any guarantee the standard did provide
would apply. Since that case does not exist, if the standard did
specify what happened in such a case, it would never apply.

If the standard says "in situation A, you are guaranteed behavior B",
the only failure scenario is when you are guaranteed to be in
situation A and do not get behavior B. If you can never be guaranteed
to be in situation A, there is no case where you are guaranteed
behavior B.

The 'as-if' clause specifically says that if the implementation
"happens to" put you in a particular situation, it may still give you
the behavior for any state it could legally have put you in. The
implementation is not penalized for what it "happens to" do but only
for what it is *required* to do.

DS
p***@ipal.net
2008-02-06 04:26:49 UTC
On Tue, 5 Feb 2008 16:12:52 -0800 (PST) David Schwartz <***@webmaster.com> wrote:
| On Feb 4, 7:20 pm, phil-news-***@ipal.net wrote:
|
|> You of all people should have no difficulty in creating such a scenario.
|
| It is impossible.
|
|> But it doesn't need to be perfect in that if it fails 1 out of 1000000
|> because of timing does not matter. The test scenario can be re-run.
|
| The problem is, the test program can never know whether it succeeded
| or failed. It is literally impossible to set up a situation in which
| the test program as guaranteed to succeed. As such, the failure
| behavior is *always* legal.

Yes it can.


|> It can be made to work _most_ of the time, and that is good enough. It
|> would set up a network connection somewhere (maybe itself) that would
|> have nothing being sent. Then create a 2nd thread that waits a specified
|> period of time (seconds or even minutes). The first thread calls poll()
|> or select(), depending on which is being tested, with a much longer timeout
|> in the syscall than the 2nd thread will be waiting for. It then waits for
|> a readable status on the descriptor in question, as well as an error status.
|> When the 2nd thread wakes up, it merely closes that descriptor. It can
|> wait for a while or end depending on how the test scenario needs to proceed.
|
| The problem is, no timeout is guaranteed to be long enough. So no
| timeout is guaranteed to provide the behavior. The standard cannot
| make a guarantee in a situation where no guarantee is possible. It
| literally cannot do so.
|
|> | You are talking about what should/must happen in a situation that is
|> | impossible to create!
|
|> Nope. Not at all. This isn't about a single threaded process. This is
|> about multiple threads sharing the same set of descriptors. It is quite
|> doable.
|
| Do it. Provide a program that is 100% guaranteed by the standard to
| produce the behavior where one thread closes a descriptor while
| another thread is polling on it.

As I have said, 100% is not necessary. 99.999999% is good enough.



| That is the only case in which any guarantee the standard did provide
| would apply. Since that case does not exist, if the standard did
| specify what happened in such a case, it would never apply.
|
| If the standard says "in situation A, you are guaranteed behavior B",
| the only failure scenario is when you are guaranteed to be in
| situation A and do not get behavior B. If you can never be guaranteed
| to be in situation A, there is no case where you are guaranteed
| behavior B.

No.

If a test program _tries_ to get into situation A, and is able to determine
whether or not it succeeded in getting into situation A (the determination
being 100% accurate even if the attempt could fail in some cases), then the
test is a valid test of the standard.


| The 'as-if' clause specifically says that if the implementations
| "happens to" put you in a particular situation, it may still give you
| the behavior for any state it could legally have put you in. The
| implementation is not penalized for what it "happens to" do but only
| for what it is *required* to do.

Then fix the 'as-if' clause.
David Schwartz
2008-02-06 11:47:48 UTC
Post by p***@ipal.net
A test program that _tries_ to get in situation A, and is able to determine
whether or not it succeeded to get in situation A (the determination being
100% accurate even if the attempt could fail in some cases), then the test
is a valid test of the standard.
I agree. The thing is, it cannot determine whether or not it
succeeded.

If you think that is true, write a conforming program that can 100%
determine whether it called 'close' before or during another thread's
call to 'poll' for the same descriptor.

The only way you can do it is to ensure the 'close' occurs before the
'poll'. There is no way you can make it possible for the 'close' to
occur during the 'poll' and still detect what happened.

Try it, it's impossible.

You are correct that if you could know that you had succeeded, the
standard could specify what happens in that case. But you can't even
know that you succeeded.

Keep in mind, the implementation may open and close files, pipes, and
sockets behind your back at any time. If you 'close' a descriptor, it
is always possible for the pthreads implementation itself to open
something and get that same descriptor.

DS
p***@ipal.net
2008-02-06 14:22:03 UTC
On Wed, 6 Feb 2008 03:47:48 -0800 (PST) David Schwartz <***@webmaster.com> wrote:
| On Feb 5, 8:26 pm, phil-news-***@ipal.net wrote:
|
|> A test program that _tries_ to get in situation A, and is able to determine
|> whether or not it succeeded to get in situation A (the determination being
|> 100% accurate even if the attempt could fail in some cases), then the test
|> is a valid test of the standard.
|
| I agree. The thing is, it cannot determine whether or not it
| succeeded.

Maybe you can't. But I can.


| If you think that is true, write a conforming program that can 100%
| determine whether called 'close' before or during another thread's
| call to 'poll' for the same descriptor.

The logic is easy. The time it would take just writing such a thing up
is not worth the effort, given that the logic is so easy.


| The only way you can do it is to ensure the 'close' occurs before the
| 'poll'. There is no way you can make it possible for the 'close' to
| occur during the 'poll' and still detect what happened.

I know you have provided lots of smart answers in this group before so I
am utterly perplexed as to why you can't envision this at all. Or has
someone else taken over your userid?

I'll explain once more. But I'm growing tired of this and will stop
repeating the obvious very soon.

A process starts thread A, which creates a descriptor to somewhere that
will appear to be ready to receive data, but never will, such as a TCP
connection to some "server" that never sends any data. The descriptor
number is stored somewhere so thread B can get it. Thread B is started.
Thread B goes to sleep for 1 HOUR. Thread A does a poll() call on that
descriptor, checking for read data, with a timeout value of 1 WEEK.
After 1 HOUR of sleeping, thread B wakes up and closes the descriptor.
If there is an error from close() it reports it. Then thread B quits.
Whenever thread A returns from poll(), be that immediately, or near when
the close happened, or a week later, it checks the return codes,
errno, and the status in the poll list. It also checks to see if it
was in the poll() call for less than the timeout specified. My idea
for the standard is for the scenario in question to set an error status
like POLLERR for the descriptor involved. Thread A checks for errors
that poll() call returned. If it gets EBADF it reports that it failed to
reach the poll() call in a timely manner and that thread B's one-hour
delay needs to be extended on this way-too-slow system. Otherwise it
reports whether it woke up with POLLERR on the descriptor or not.
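A minimal sketch of that test, with a pipe standing in for the idle TCP
connection and seconds standing in for the HOUR/WEEK delays (names and
numbers are illustrative; the POLLERR result being looked for is the
proposal above, not behaviour any current kernel is known to provide):

#include <errno.h>
#include <poll.h>
#include <pthread.h>
#include <stdio.h>
#include <time.h>
#include <unistd.h>

static int fds[2];

static void *thread_b(void *arg)
{
    sleep(5);                        /* stand-in for the 1 HOUR sleep */
    if (close(fds[0]) < 0)
        perror("close");             /* report any close() error */
    return NULL;
}

int main(void)
{
    pthread_t b;
    struct pollfd p = { .events = POLLIN };
    time_t start = time(NULL);

    pipe(fds);                       /* read end never becomes readable */
    p.fd = fds[0];
    pthread_create(&b, NULL, thread_b, NULL);

    int n = poll(&p, 1, 60 * 1000);  /* stand-in for the 1 WEEK timeout */
    long elapsed = (long)(time(NULL) - start);

    if (n > 0 && (p.revents & POLLNVAL)) {
        /* The close() beat the poll(): the descriptor was already gone
           when poll() examined it (select() would fail with EBADF).
           Report it so the test can be re-run with a longer delay. */
        printf("test invalid: close() ran first, re-run\n");
    } else {
        /* Waking near the close, at the timeout, or reporting an error
           on the descriptor is the behaviour under test. */
        printf("poll=%d revents=0x%x elapsed=%lds\n", n, p.revents, elapsed);
    }
    pthread_join(b, NULL);
    return 0;
}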



| Try it, it's impossible.

It's way too simple for me to even put in the effort. I know it can be
done.


| You are correct that if you could know that you had succeeded, the
| standard could specify what happens in that case. But you can't even
| know that you succeeded.

If the poll() call returns EBADF then the poll() did run AFTER that
descriptor was closed. If poll() successfully enters, then that
descriptor was still open at the time poll() runs, and the scenario
is under test.


| Keep in mind, the implementation may open and close files, pipes, and
| sockets behind your back at any time. If you 'close' a descriptor, it
| is always possible for the pthreads implementation itself to open
| something and get that same descriptor.

The "implementation" is the kernel. Why would the kernel do that? Just
because the standards says that is not defined? I think we can REASONABLY
assume a REASONABLE kernel that does not just make up descriptors. Who
said anything about using pthreads in a test program? It is not pthreads
that would be tested, it is the kernel syscall layer. And the test program
itself would not be coded to risk its own test by trying to do any other
open calls (or dup or whatever).
David Schwartz
2008-02-06 16:13:45 UTC
Post by p***@ipal.net
A process starts thread A which creates a descriptor to somewhere that
will appear to be ready to receive data, but never will, such as a TCP
connection to some "server" that never sends any data. The descriptor
number is stored somewhere so thread B can get it. Thread B is started.
Thread B goes to sleep for 1 HOUR. Thread A does a poll() call on that
descriptor, checking for read data, with a timeout value of 1 WEEK.
After 1 HOUR of sleeping, thread B wakes up and closes the descriptor.
If there is an error from close() it reports it. Then thread B quits.
Whenever thread A returns from poll(), be that immediate, or near when
the the close happened, or a week later, it checks the return codes,
errno, and the status in the poll list. It also checks to see if it
was in the poll() call for less than the timeout specified. My idea
for the standard is for the scenario in question to set an error status
like POLLERR for the descriptor involved. Thread A checks for errors
that poll() call returned. If it get EBADF it reports that it failed to
reach the poll() call in a timely manner and that thread B's one hour
delay needs to be extended on this way too slow system. Otherwise it
reports whether it woke up with POLLERR on the descriptor or not.
That doesn't work. If it gets 'EBADF' it has no way to know which of
the following two scenarios occurred:

1) The scheduler chose to delay the thread for a very long time, as it
has every right to do, and it still managed to sneak in the 'close'
before the 'poll', or

2) The 'close' occurred during the 'poll', and the platform chose to
act as if the 'poll' was delayed by returning 'EBADF'.
Post by p***@ipal.net
| Try it, it's impossible.
It's way too simple for me to even put in the effort. I know it can be
done.
Then do it. Tell me how you distinguish case 1 from case 2 above.
Post by p***@ipal.net
| You are correct that if you could know that you had succeeded, the
| standard could specify what happens in that case. But you can't even
| know that you succeeded.
If the poll() call returns EBADF then the poll() did run AFTER that
descriptor was closed.
No, the 'poll' could have run before the descriptor was closed and the
platform decided to act as if the 'poll' had run after, knowing that
the program cannot tell the difference between these two cases.

Try again.
Post by p***@ipal.net
If poll() successfully enters, then that
descriptor was still open at the time poll() runs, and the scenario
is under test.
But you can't tell whether poll successfully enters or not. How can
you tell?
Post by p***@ipal.net
| Keep in mind, the implementation may open and close files, pipes, and
| sockets behind your back at any time. If you 'close' a descriptor, it
| is always possible for the pthreads implementation itself to open
| something and get that same descriptor.
The "implementation" is the kernel.
No. The implementation is everything needed to provide the functions
the standard asks for. This includes the pthreads library.
Post by p***@ipal.net
Why would the kernel do that?
Well, let's say when you call 'pthread_create' it starts a monitor
thread. Let's say that monitor thread runs from time to time to check
on threads that need to be reaped and other such things. It's possible
that monitor thread might need to open a file, maybe a system log file
to report on memory usage, who knows?
Post by p***@ipal.net
Just
because the standards says that is not defined? I think we can REASONABLY
assume a REASONABLE kernel that does not just make up descriptors.
No, we can't assume that. POSIX-defined library functions open
descriptors all the time. If you call "gethostbyname" and DNS is used,
it's not unreasonable that a persistent TCP connection might be opened
up to the nameserver, for example.
Post by p***@ipal.net
Who
said anything about using pthreads in a test program? It is not pthreads
that would be tested, it is the kernel syscall layer. And the test program
itself would not be coded to risk its own test by trying to do any other
open calls (or dup or whatever).
So you're saying perhaps the standard should guarantee a particular
behavior for processes that don't use pthreads even though it can't
keep that guarantee for processes that do? There is no relevant
multi-threading standard other than pthreads for multiple threads on
systems that even support 'poll', so now you're talking about
guarantees on a hypothetical platform that doesn't apply to any real-
world platform?

You're way out there now.

For practical purposes, we're talking about what POSIX threads say
about what happens when one thread calls 'close' on a descriptor while
another thread is, or is about to, call 'poll' on it.

DS
p***@ipal.net
2008-02-07 02:59:17 UTC
On Wed, 6 Feb 2008 08:13:45 -0800 (PST) David Schwartz <***@webmaster.com> wrote:
| On Feb 6, 6:22 am, phil-news-***@ipal.net wrote:
|
|> A process starts thread A which creates a descriptor to somewhere that
|> will appear to be ready to receive data, but never will, such as a TCP
|> connection to some "server" that never sends any data. The descriptor
|> number is stored somewhere so thread B can get it. Thread B is started.
|> Thread B goes to sleep for 1 HOUR. Thread A does a poll() call on that
|> descriptor, checking for read data, with a timeout value of 1 WEEK.
|> After 1 HOUR of sleeping, thread B wakes up and closes the descriptor.
|> If there is an error from close() it reports it. Then thread B quits.
|> Whenever thread A returns from poll(), be that immediate, or near when
|> the the close happened, or a week later, it checks the return codes,
|> errno, and the status in the poll list. It also checks to see if it
|> was in the poll() call for less than the timeout specified. My idea
|> for the standard is for the scenario in question to set an error status
|> like POLLERR for the descriptor involved. Thread A checks for errors
|> that poll() call returned. If it get EBADF it reports that it failed to
|> reach the poll() call in a timely manner and that thread B's one hour
|> delay needs to be extended on this way too slow system. Otherwise it
|> reports whether it woke up with POLLERR on the descriptor or not.
|
| That doesn't work. If it gets 'EBADF' it has no way to know which of
| the following two scenarios occurred:
|
| 1) The scheduler chose to delay the thread for a very long time, as it
| has every right to do, and it still managed to sneak in the 'close'
| before the 'poll', or
|
| 2) The 'close' occurred during the 'poll', and the platform chose to
| act as if the 'poll' was delayed by returning 'EBADF'.

Either way, the platform is presenting the same concept. Acting as if
the close occurred before is equivalent to it actually having occurred
before. If this is the error you get, it won't matter.


|> | Try it, it's impossible.
|
|> It's way too simple for me to even put in the effort. I know it can be
|> done.
|
| Then do it. Tell me how you distinguish case 1 from case 2 above.

You don't need to.


|> | You are correct that if you could know that you had succeeded, the
|> | standard could specify what happens in that case. But you can't even
|> | know that you succeeded.
|
|> If the poll() call returns EBADF then the poll() did run AFTER that
|> descriptor was closed.
|
| No, the 'poll' could have run before the descriptor was closed and the
| platform decided to act as if the 'poll' had run after, knowing that
| the program cannot tell the difference between these two cases.
|
| Try again.

Your logic ... that a platform could choose to do the wrong thing and
give the wrong perception ... could be applied to just about every
interface that is standardized. For example, even though a file does
exist, the platform could decide to act as if it does not and give an
open(..., O_RDONLY) call an ENOENT error. Even though the file could actually
exist, the platform is _pretending_ that it does not. This does not
mean we cannot have standards for the open() syscall.


|> If poll() successfully enters, then that
|> descriptor was still open at the time poll() runs, and the scenario
|> is under test.
|
| But you can't tell whether poll successfully enters or not. How can
| you tell?

You can tell if the platform wants to let you think it. If the platform
decides to fake a scenario different from what actually happened, that is
very much possible for just about anything.


|> | Keep in mind, the implementation may open and close files, pipes, and
|> | sockets behind your back at any time. If you 'close' a descriptor, it
|> | is always possible for the pthreads implementation itself to open
|> | something and get that same descriptor.
|
|> The "implementation" is the kernel.
|
| No. The implementation is everything needed to provide the functions
| the standard asks for. This includes the pthreads library.

One can do these things without the pthreads library. So what if the
way of doing them is not standard? We don't have to define how the
descriptor got closed ... only that it does. So what if a process
calls the Linux-specific clone() syscall with CLONE_FILES set, to make
raw threads?
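A hedged sketch of such a raw-thread test, assuming the Linux-specific
clone() with CLONE_FILES (plus CLONE_VM, and SIGCHLD so the child can be
reaped with wait()); the pipe and all names are illustrative:

#define _GNU_SOURCE
#include <sched.h>
#include <poll.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/wait.h>

static int fds[2];

static int closer(void *arg)
{
    sleep(2);        /* give the parent time to enter poll() */
    close(fds[0]);   /* shared fd table: this closes it for the parent too */
    return 0;
}

int main(void)
{
    char *stack = malloc(64 * 1024);
    struct pollfd p = { .events = POLLIN };

    pipe(fds);
    p.fd = fds[0];

    /* No pthreads library at all: the child shares only memory and the
       descriptor table with the parent. */
    if (clone(closer, stack + 64 * 1024,
              CLONE_VM | CLONE_FILES | SIGCHLD, NULL) < 0) {
        perror("clone");
        return 1;
    }

    int n = poll(&p, 1, 10 * 1000);
    printf("poll=%d revents=0x%x\n", n, p.revents);
    wait(NULL);
    return 0;
}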


|> Why would the kernel do that?
|
| Well, let's say when you call 'pthread_create' it starts a monitor
| thread. Let's say that monitor threads runs from time to time to check
| on threads that need to be reaped and other such things. It's possible
| that monitor thread might need to open a file, maybe a system log file
| to report on memory usage, who knows?

To create a proper test program for a kernel interface, you can be sure
I would not use pthreads. And what you described is not the only reason
for such a decision on how to design a test module for this as part of
a robust test suite.


|> Just
|> because the standards says that is not defined? I think we can REASONABLY
|> assume a REASONABLE kernel that does not just make up descriptors.
|
| No, we can't assume that. POSIX-defined library functions open
| descriptors all the time. If you call "gethostbyname" and DNS is used,
| it's not unreasonable that a persistent TCP connection might be opened
| up to the nameserver, for example.
|
|> Who
|> said anything about using pthreads in a test program? It is not pthreads
|> that would be tested, it is the kernel syscall layer. And the test program
|> itself would not be coded to risk its own test by trying to do any other
|> open calls (or dup or whatever).
|
| So you're saying perhaps the standard should guarantee a particular
| behavior for processes that don't use pthreads even though it can't
| keep that guarantee for processes that don't? There is no relevant
| multi-threading standard other than pthreads for multiple-threads on
| systems that even support 'poll', so now you're talking about
| guarantees on a hypothetical platform that doesn't apply to any real-
| world platform?

A thread is merely a way to get a descriptor to be closed. We could also
force it closed by adding a facility to do that in some other way. The
point is, the proposal I suggest for poll/select behaviour when one or
all of the descriptors is closed is not based on _how_ it gets closed,
but rather, that it does get closed, however that may be. A variant of
pthreads that chooses not to open anything the main program does not open
explicitly could be used as a test basis. Or one can just not use the
pthreads library at all.


| You're way out there now.

I'm outside the box.


| For practical purposes, we're talking about what POSIX threads say
| about what happens when on thread calls 'close' on a descriptor while
| another thread is, or is about to, call 'poll' on it.

Almost.
David Schwartz
2008-02-08 01:02:00 UTC
Post by p***@ipal.net
Your logic ... that a platform could choose to do the wrong thing and
give the wrong perception ... could be applied to just about every
interface that is standardized.
That is correct.
Post by p***@ipal.net
For example, even though a file does
exist, the platform could decide to act as if it does not and give an
open(,O_RDONLY) an ENOENT error. Even though the file could actually
exist, the platform is _pretending_ that it does not. This does not
mean we cannot have standards for the open() syscall.
It just means those standards cannot distinguish the case where the
file exists, but the implementation decides to hide it from the
program, from the case where the file does not exist.
Post by p***@ipal.net
|> If poll() successfully enters, then that
|> descriptor was still open at the time poll() runs, and the scenario
|> is under test.
|
| But you can't tell whether poll successfully enters or not. How can
| you tell?
You can tell if the platform wants to let you think it. If the platform
decides to fake a scenario different that actually happened, this is
very much possible for just about anything.
The platform may "fake a scenario different from what actually happened"
provided no compliant program has any other way to know what actually
happened. If the program has another way, then the platform cannot
fake it.

For example, if the standard provides two ways to check if a file
exists, they must provide consistent results. But if there's only one way
to check if a file exists, then a file exists if and only if that one
way says so. The platform may certainly hide files from the program.
Post by p***@ipal.net
|> | Keep in mind, the implementation may open and close files, pipes, and
|> | sockets behind your back at any time. If you 'close' a descriptor, it
|> | is always possible for the pthreads implementation itself to open
|> | something and get that same descriptor.
|
|> The "implementation" is the kernel.
|
| No. The implementation is everything needed to provide the functions
| the standard asks for. This includes the pthreads library.
One can do these things without the pthreads library. So what if the
way of doing them is not standard. We don't have to define how the
descriptor got closed ... only that it does. So what if a process
calls the Linux specific clone() syscall with CLONE_FILES set, to make
raw threads.
If we aren't talking about a standard, then we can do anything we
want, but it's really not particularly interesting. The 'poll'
function and the 'close' function are standardized. Pthreads are a
standard. Guarantees that aren't within the scope of the standard
wouldn't be particularly helpful to people using those interfaces.
Worse, it would let people who relied on them write programs that are
harder to port.

IMO, that would be a huge step backwards.
Post by p***@ipal.net
|> Why would the kernel do that?
|
| Well, let's say when you call 'pthread_create' it starts a monitor
| thread. Let's say that monitor threads runs from time to time to check
| on threads that need to be reaped and other such things. It's possible
| that monitor thread might need to open a file, maybe a system log file
| to report on memory usage, who knows?
To create a proper test program for a kernel interface, you can be sure
I would not use pthreads. And what you described is not the only reason
for such a decision on how to design a test module for this as part of
a robust test suite.
If you're talking specifically about the Linux kernel interface, then
you can do whatever you want, there's no standard. I just don't think
that would be particularly useful. It may improve Linux's
implementation of POSIX and pthreads, but it wouldn't affect the vast
majority of programs that use 'poll'.
Post by p***@ipal.net
| So you're saying perhaps the standard should guarantee a particular
| behavior for processes that don't use pthreads even though it can't
| keep that guarantee for processes that don't? There is no relevant
| multi-threading standard other than pthreads for multiple-threads on
| systems that even support 'poll', so now you're talking about
| guarantees on a hypothetical platform that doesn't apply to any real-
| world platform?
A thread is merely a way to get a descriptor to be closed. We could also
force it closed by adding a facility to do that in some other way. The
point is, the proposal I suggest for poll/select behaviour when one or
all of the descriptors is closed is not based on _how_ it gets closed,
but rather, that it does get closed, however that may be. A variant of
pthreads that chooses not do open anything the main program does not do
explicitly could be used as a test basis. Or one can just not use the
pthreads library at all.
This wouldn't help in any realistic cases. Realistic cases are going
to be using pthreads and are going to want guarantees that are
portable and within the standard.

DS

David Schwartz
2008-02-06 11:51:51 UTC
Post by p***@ipal.net
| The 'as-if' clause specifically says that if the implementations
| "happens to" put you in a particular situation, it may still give you
| the behavior for any state it could legally have put you in. The
| implementation is not penalized for what it "happens to" do but only
| for what it is *required* to do.
Then fix the 'as-if' clause.
That would prohibit a large number of optimizations that are provably
harmless. It would not permit an application to do anything it can't
already do. So there would be a definite harm and no benefit. There is
no harm in a violation that no compliant program can detect.

The standard specifies the *observable* behavior a *compliant*
application will see. That's its job and its scope. It provides
guarantees that can be relied upon and adding "unreliable guarantees"
serves no useful purpose.

DS
p***@ipal.net
2008-02-06 14:26:37 UTC
On Wed, 6 Feb 2008 03:51:51 -0800 (PST) David Schwartz <***@webmaster.com> wrote:
| On Feb 5, 8:26 pm, phil-news-***@ipal.net wrote:
|
|> | The 'as-if' clause specifically says that if the implementations
|> | "happens to" put you in a particular situation, it may still give you
|> | the behavior for any state it could legally have put you in. The
|> | implementation is not penalized for what it "happens to" do but only
|> | for what it is *required* to do.
|
|> Then fix the 'as-if' clause.
|
| That would prohibit a large number of optimizations that are provably
| harmless. It would not permit an application to do anything it can't
| already do. So there would be a definite harm and no benefit. There is
| no harm in a violation that no compliant program can detect.
|
| The standard specifies the *observable* behavior a *compliant*
| application will see. That's its job and its scope. It provides
| guarantees that can be relied upon and adding "unreliable guarantees"
| serves no useful purpose.

I don't agree. The standard specifies behaviours that COULD be seen by
applications, and would be accurately seen by those that are compliant
*AND* are doing correct steps to see the behaviour (there are many ways
to do steps that are incorrect with respect to testing compliance by the
kernel, but still achieve what the program tries to do under a scenario
where the kernel is compliant ... and merely _may_ achieve that when the
kernel is not compliant).

If we never test an implementation, the kernel could still be compliant,
or not. If we run a program that is not test-grade quality, it may or may
not do what it intends to do, orthogonal to the compliance of the kernel.
David Schwartz
2008-02-04 20:32:25 UTC
Post by shebble
On Linux, lets say 2.6 kernel...
Can you close a file descriptor in one thread while another thread is using it
in a select(), poll(), or epoll_wait() ?
What will happen?
For 'select' and 'poll' it is literally impossible to do this
reliably. There is simply no way you can ensure the 'close' comes
*AFTER* the 'select' or 'poll' begins as opposed to right before. So
this will always be undefined.

As for 'epoll_wait', it's perfectly legal. If that's the last
reference to the file, it will no longer be waited on. One of the
advantages of 'epoll' is that you don't need the synchronization you
need with 'poll' and 'select'.

DS
shebble
2008-02-05 01:52:54 UTC
Post by David Schwartz
Post by shebble
On Linux, lets say 2.6 kernel...
Can you close a file descriptor in one thread while another thread is using it
in a select(), poll(), or epoll_wait() ?
What will happen?
For 'select' and 'poll' it is literally impossible to do this
reliably. There is simply no way you can ensure the 'close' comes
*AFTER* the 'select' or 'poll' begins as opposed to right before. So
this will always be undefined.
Ok, I see what you are saying. At the time you call close(), you don't know if
your other thread is in select() or is just about to select(). Even if you put a
pthread condition right before the select() call, for example. But I don't think
that matters? The point is it COULD happen when already in select().
Post by David Schwartz
As for 'epoll_wait', it's perfectly legal. If that's the last
reference to the file, it will not longer be waited on. One of the
advantages of 'epoll' is that you don't need the synchronization you
need with 'poll' and 'select'.
DS
So epoll should just do nothing? It is like the file descriptor is removed and
was never part of the set?

What I am seeing is that it still blocks, no return. I did not specify a timeout
though. So I guess that makes sense. Now it is waiting forever on... nothing.
p***@ipal.net
2008-02-05 03:38:33 UTC
On Mon, 04 Feb 2008 18:52:54 -0700 shebble <***@example.com> wrote:
| David Schwartz wrote:
|> On Feb 2, 9:31 pm, shebble <***@example.com> wrote:
|>
|>> On Linux, lets say 2.6 kernel...
|>>
|>> Can you close a file descriptor in one thread while another thread is using it
|>> in a select(), poll(), or epoll_wait() ?
|>>
|>> What will happen?
|>
|> For 'select' and 'poll' it is literally impossible to do this
|> reliably. There is simply no way you can ensure the 'close' comes
|> *AFTER* the 'select' or 'poll' begins as opposed to right before. So
|> this will always be undefined.
|>
|
| Ok, I see what you are saying. At the time you call close(), you don't know if
| your other thread is in select() or is just about to select(). Even if you put a
| pthread condition right before the select() call for example. But I don't think
| that matters.? The point is it COULD happen when already in select().

You can create the situation with a very high likelihood of it happening
that way by having the thread that will do the close() wait for a few
seconds, minutes, hours, or days, depending on what kind of chance you
are dealing with that the waiter thread will be delayed getting into
the long-term select() call. You can also detect the unlikely event
that the descriptor gets closed before select() gets to wait on it, as
that will give the select() caller EBADF. It can exit the program with
a status that indicates this event happened, and the wrapper script can
run it again.

How likely is it that the thread that will call select() will get delayed
for a few days ... say because the system is overloaded when you happen
to be running the test case to see if the behaviour complies with whatever
is specified for this kind of scenario? Hint: it's non-zero but very
close to zero. If the program is set to do a 72-hour wait before closing
the descriptor the other thread is waiting on for a message that will
never arrive, I bet you will never see a failure scenario. In fact I
suggest that it will be a major struggle to get it to fail if you set
it to just a mere minute. And so what if the test could fail on rare
occasions. We only need to see how the system behaves in the cases
where it succeeds.


|> As for 'epoll_wait', it's perfectly legal. If that's the last
|> reference to the file, it will not longer be waited on. One of the
|> advantages of 'epoll' is that you don't need the synchronization you
|> need with 'poll' and 'select'.
|>
|> DS
|
| So epoll should just do nothing? It is like the file descriptor is removed and
| was never part of the set?
|
| What I am seeing is that it still blocks, no return. I did not specify a timeout
| though. So I guess that makes sense. Now it is waiting forever on... nothing.

It depends on definitions. While I prefer having the definition be such
that the close() of the descriptor causes a behaviour like any other error
condition coming about, I could deal with other definitions. In any case,
I see no issue with creating a test scenario.
David Schwartz
2008-02-05 03:45:59 UTC
Permalink
Post by p***@ipal.net
It depends on definitions. While I prefer having the definition be such
that the close() of the descriptor causes a behaviour like any other error
condition coming about, I could deal with other definitions. In any case,
I see no issue with creating a test scenario.
What if the implementation decides to have 'select' immediately return
EBADF in that case? Can the standard prohibit it from doing that? How
could a conforming application tell?

The standard can only require behavior that a conforming application
can detect. The 'as-if' rule means that any requirements that a
conforming application cannot detect are as if they didn't exist. So
even if the standard said something, it would be voided by the 'as-if'
rule.

In any event, the standards don't say anything. It's logically
equivalent to one thread calling 'free' on a memory block while
another thread is or might be using it. Crashing is permitted by
POSIX.

DS
p***@ipal.net
2008-02-05 04:30:00 UTC
On Mon, 4 Feb 2008 19:45:59 -0800 (PST) David Schwartz <***@webmaster.com> wrote:
| On Feb 4, 7:38 pm, phil-news-***@ipal.net wrote:
|
|> It depends on definitions. While I prefer having the definition be such
|> that the close() of the descriptor causes a behaviour like any other error
|> condition coming about, I could deal with other definitions. In any case,
|> I see no issue with creating a test scenario.
|
| What if the implementation decides to have 'select' immediately return
| EBADF in that case? Can the standard prohibit it from doing that? How
| could a conforming application tell?

I'm not going to judge what the standard can or cannot do. Maybe we should
discuss it in terms of creating a standard. I have my ideas of what such a
standard should be. It would include things like extending POLLERR to also
apply to input polling as well as output polling. It would be set for the
descriptor that got closed for that poll() call. Normal code getting such
a result would be expected to try doing the read() or write() previously
attempted to see what error gets returned as a result of that call. They
would get EBADF unless some thread opened a new descriptor that got that
number. That's a scenario (changed descriptor) that any threaded app could
face anyway in other parts of logic (let's say the poll() returns that data
is available to read, but before it gets around to doing the read() some other
thread closes the descriptor and opens one with the same number, or even
calls dup2() to ensure such trickery). But a standard only needs to define
what poll() or select() would do, not how to program around it.
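A small sketch of how "normal code" might consume that proposed
convention: treat an error bit from poll() as a cue to retry the original
operation and recover the concrete errno (the POLLERR-on-close part is
hypothetical; POLLNVAL is what poll() already reports for a descriptor
closed before the call):

#include <errno.h>
#include <poll.h>
#include <unistd.h>

/* Wait for input on fd; on an error bit, retry the read() to learn the
   real errno (EBADF if another thread closed the descriptor, unless the
   number was reused, which a threaded app has to live with anyway). */
int wait_readable(int fd)
{
    struct pollfd p = { .fd = fd, .events = POLLIN };
    int n = poll(&p, 1, -1);

    if (n > 0 && (p.revents & (POLLERR | POLLNVAL))) {
        char c;
        if (read(fd, &c, 1) < 0)
            return -errno;
    }
    return n;
}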


| The standard can only require behavior that a conforming application
| can detect. The 'as-if' rule means that any requirements that a
| conforming application cannot detect are as if they didn't exist. So
| even if the platform said something, it would be voided by the 'as-if'
| rule.

A conforming application can detect it.


| In any event, the standards don't say anything. It's logically
| equivalent to one thread calling 'free' on a memory block while
| another thread is or might be using it. Crashing is permitted by
| POSIX.

Of course. And maybe the best course is for the standard to remain silent
on the subject of how to handle poll() or select() getting a descriptor
pulled out from under it by another thread close()ing it. Although I do
have my preference for what the standard should say, that is not the only
option.
David Schwartz
2008-02-06 00:07:56 UTC
Post by p***@ipal.net
A conforming application can detect it.
No, it cannot, since a conforming application *cannot* close a
descriptor in one thread while another thread is or might be using it.

My other response covers this in more detail. You are asking for the
literally impossible.

DS
p***@ipal.net
2008-02-06 04:28:21 UTC
On Tue, 5 Feb 2008 16:07:56 -0800 (PST) David Schwartz <***@webmaster.com> wrote:
| On Feb 4, 8:30 pm, phil-news-***@ipal.net wrote:
|
|> A conforming application can detect it.
|
| No, it cannot, since a conforming application *cannot* close a
| descriptor in one thread while another thread is or might be using it.

What makes you think that is so?


| My other response covers this in more detail. You are asking for the
| literally impossible.

What error from close() happens in the closer thread when that descriptor
is being waited on in poll() in the waiter thread? Is that standardized?
David Schwartz
2008-02-05 03:43:46 UTC
Post by shebble
Post by David Schwartz
For 'select' and 'poll' it is literally impossible to do this
reliably. There is simply no way you can ensure the 'close' comes
*AFTER* the 'select' or 'poll' begins as opposed to right before. So
this will always be undefined.
Ok, I see what you are saying. At the time you call close(), you don't know if
your other thread is in select() or is just about to select(). Even if you put a
pthread condition right before the select() call for example. But I don't think
that matters.? The point is it COULD happen when already in select().
Exactly, and hence the standard *cannot* require that something
different happen. If you cannot force case A over case B, the standard
cannot say that something must happen in case A that cannot happen in
case B.
Post by shebble
Post by David Schwartz
As for 'epoll_wait', it's perfectly legal. If that's the last
reference to the file, it will not longer be waited on. One of the
advantages of 'epoll' is that you don't need the synchronization you
need with 'poll' and 'select'.
So epoll should just do nothing? It is like the file descriptor is removed and
was never part of the set?
There is a huge difference between the logic of 'epoll' and the logic
of 'poll'/'select'. The 'epoll' function waits on the epoll set, which
is a logically-persistent thing. The 'epoll_wait' function does not
modify the epoll set in any way, it just waits until the epoll set
gets an event or events.
Post by shebble
What I am seeing is that it still blocks, no return. I did not specify a timeout
though. So I guess that makes sense. Now it is waiting forever on... nothing.
That is what it should do, since another thread might add a descriptor
to the set at any time.

DS
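A minimal sketch of that persistence, assuming an initially empty epoll
set that a second thread grows while the first is already blocked in
epoll_wait() (all names illustrative):

#include <stdio.h>
#include <unistd.h>
#include <pthread.h>
#include <sys/epoll.h>

static int ep;
static int fds[2];

static void *adder(void *arg)
{
    struct epoll_event ev = { .events = EPOLLIN };

    sleep(1);                                   /* waiter is blocked by now */
    ev.data.fd = fds[0];
    epoll_ctl(ep, EPOLL_CTL_ADD, fds[0], &ev);  /* grow the set under it */
    write(fds[1], "x", 1);                      /* and make it readable */
    return NULL;
}

int main(void)
{
    struct epoll_event ev;
    pthread_t t;

    ep = epoll_create(1);
    pipe(fds);
    pthread_create(&t, NULL, adder, NULL);

    /* Blocks on an empty set; wakes once the descriptor added by the
       other thread becomes readable. */
    int n = epoll_wait(ep, &ev, 1, -1);
    printf("epoll_wait=%d fd=%d\n", n, ev.data.fd);
    pthread_join(t, NULL);
    return 0;
}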
p***@ipal.net
2008-02-05 04:33:45 UTC
On Mon, 4 Feb 2008 19:43:46 -0800 (PST) David Schwartz <***@webmaster.com> wrote:
| On Feb 4, 5:52 pm, shebble <***@example.com> wrote:
|> David Schwartz wrote:
|
|> > For 'select' and 'poll' it is literally impossible to do this
|> > reliably. There is simply no way you can ensure the 'close' comes
|> > *AFTER* the 'select' or 'poll' begins as opposed to right before. So
|> > this will always be undefined.
|
|> Ok, I see what you are saying. At the time you call close(), you don't know if
|> your other thread is in select() or is just about to select(). Even if you put a
|> pthread condition right before the select() call for example. But I don't think
|> that matters.? The point is it COULD happen when already in select().
|
| Exactly, and hence the standard *cannot* require that something
| different happen. If you cannot force case A over case B, the standard
| cannot say that something must happen in case A that cannot happen in
| case B.

That is the silliest thing I have heard.

A standard sure can say what will happen in a given scenario. Just
because there might be the very slightest possibility of a program not
succeeding in creating that scenario ONE TIME out of a BILLION does not
mean we can't understand the scenario AND define how the kernel is to
behave when the scenario does happen 999999999 times out of 1000000000.
David Schwartz
2008-02-05 21:19:44 UTC
Post by p***@ipal.net
A standard sure can say what will happening in a given scenario.
Only if a conforming application can create that scenario.
Post by p***@ipal.net
Just
because there might be the very slightest possibility of a program not
succeeding in creating that scenario ONE TIME out of a BILLION does not
mean we can't understand the scenario AND define how the kernel is to
behave when the scenario does happen 999999999 times out of 1000000000.
Yes, that's exactly what it means.

If a compliant program can never be certain it has created the
scenario, it can never demonstrate a violation if it doesn't get the
behavior required for the scenario. The 'as-if' rule in every standard
says that only behavior that can be detected by a compliant program
can be specified by the standard.

If a compliant program cannot create case A and be sure it has not
created case B, the standard *cannot* say that the behavior in case B
is not legal in case A.

This is, honestly, one of the most important things to understand
about standards, and it surprises me that people still don't
understand it.

If a compliant program cannot know that it is not in case B, then the
standard cannot prohibit the behavior it permits in case B. It's that
simple.

DS
p***@ipal.net
2008-02-06 04:35:57 UTC
On Tue, 5 Feb 2008 13:19:44 -0800 (PST) David Schwartz <***@webmaster.com> wrote:
| On Feb 4, 8:33 pm, phil-news-***@ipal.net wrote:
|
|> A standard sure can say what will happening in a given scenario.
|
| Only if a conforming application can create that scenario.
|
|> Just
|> because there might be the very slightest possibility of a program not
|> succeeding in creating that scenario ONE TIME out of a BILLION does not
|> mean we can't understand the scenario AND define how the kernel is to
|> behave when the scenario does happen 999999999 times out of 1000000000.
|
| Yes, that's exactly what it means.
|
| If a compliant program can never be certain it has created the
| scenario, it can never demonstrate a violation if it doesn't get the
| behavior required for the scenario. The 'as-if' rule in every standard
| says that only behavior that can be detected by a compliant program
| can be specified by the standard.

The test program can TRY to create the situation. It may, or may not,
succeed at creating it. But it can KNOW whether it succeeded or not.
If it succeeded, it can examine the results for correct behaviour.
If it failed, it can report that it needs to be re-run for a valid
test of the (suggested) standard. If a test program is only rarely
able to create the scenario, it would be considered of low value.
If a test program only rarely fails to create the scenario, then it
would be good enough, especially if wrapped in a script that checks
for the failure and automatically reruns it to try again.


| If a compliant program cannot create case A and be sure it has not
| created case B, the standard *cannot* say that the behavior in case B
| is not legal in case A.

I am only suggesting a new behaviour in one case, not two.


| This is, honestly, one of the most important things to understand
| about standards, and it surprises me that people still don't
| understand it.

Maybe you should try to explain it better. Either your whole idea is just
totally wrong (and this might explain why some standards have avoided
doing some things we've wanted), or you haven't done a very good job of
explaining the concept you seem to be trying to explain. Of course it
does not help much to say something is impossible when it seems to be
so clearly possible.


| If a compliant program cannot know that it is not in case B, then the
| standard cannot prohibit the behavior it permits in case B. It's that
| simple.

Not all programs need to even care. The TEST program needs to care.
And what I proposed involves a scenario that CAN be reliably tested
for, even if it cannot be reliably created.
--
|---------------------------------------/----------------------------------|
| Phil Howard KA9WGN (ka9wgn.ham.org) / Do not send to the address below |
| first name lower case at ipal.net / spamtrap-2008-02-05-***@ipal.net |
|------------------------------------/-------------------------------------|
David Schwartz
2008-02-06 16:16:35 UTC
Permalink
Post by p***@ipal.net
A standard sure can say what will happen in a given scenario. Just
because there might be the very slightest possibility of a program not
succeeding in creating that scenario ONE TIME out of a BILLION does not
mean we can't understand the scenario AND define how the kernel is to
behave when the scenario does happen 999999999 times out of 1000000000.
Yes, that's exactly what it means. If a compliant program cannot know
it is not in case A, then any behavior legal for case A is legal in
that case.

There is simply no rational reason for a standard to prohibit
*undetectable* violations of its terms. So they all contain an 'as if'
rule that basically says that the standard only specifies the
observable behavior of the implementation, and the implementation is
free to violate the wording of the standard provided such a violation
is unobservable (cannot be detected *ever* by *any* compliant
program).

If the program cannot be sure it is not in case A, then no matter what
the standard says, the behavior permitted for case A is permitted for
the case the program is in.

DS
p***@ipal.net
2008-02-05 03:29:03 UTC
Permalink
On Mon, 4 Feb 2008 12:32:25 -0800 (PST) David Schwartz <***@webmaster.com> wrote:
| On Feb 2, 9:31 pm, shebble <***@example.com> wrote:
|
|> On Linux, lets say 2.6 kernel...
|>
|> Can you close a file descriptor in one thread while another thread is using it
|> in a select(), poll(), or epoll_wait() ?
|>
|> What will happen?
|
| For 'select' and 'poll' it is literally impossible to do this
| reliably. There is simply no way you can ensure the 'close' comes
| *AFTER* the 'select' or 'poll' begins as opposed to right before. So
| this will always be undefined.

The scenario is possible to create. A test scenario does not have to
be perfect by being sure the system timing can't allow the thread that
does the closing of the descriptor to run until after the select/poll. You can
have the select/poll wait for a very long time in a test case that never
gets data. You can have the closer thread also wait a very long time to
minimize the statistical chance of the closer doing the close before the
thread that does the select/poll has actually entered the kernel to
wait. In the very extreme cases where it takes the waiter thread
minutes to get into select/poll due to weird system performance, you can
simply dismiss that test run as "not what we wanted to test". It will
result in an error in select/poll as if waiting on a descriptor that is
not open. So the program can even detect if it failed to carry out the
intended test.

So we can have a test program that will easily be able to carry
out the test in virtually all instances of being run, and readily detect
when it failed to do so (and possibly even just try again up to some
number of times).

So we _can_ test what the system _would_ do in this case. So that is no
impediment to defining what it _should_ or _must_ do when it happens.
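
A minimal sketch of the kind of test program I mean (assuming Linux
with pthreads; the pipe, the sleep lengths, and the reporting are all
arbitrary choices of mine, not anything a standard requires):

#include <errno.h>
#include <poll.h>
#include <pthread.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

static int fds[2];   /* a pipe; we poll the read end and never write */

/* Closer thread: sleep long enough that the main thread is almost
   certainly blocked in poll(), then close the descriptor. */
static void *closer(void *arg)
{
    (void)arg;
    sleep(2);
    close(fds[0]);
    return NULL;
}

int main(void)
{
    pthread_t t;
    struct pollfd pfd;
    int rc;

    if (pipe(fds) == -1)
        return 1;
    pthread_create(&t, NULL, closer, NULL);

    pfd.fd = fds[0];
    pfd.events = POLLIN;

    /* No data ever arrives, so any wakeup before the 10 second
       timeout was caused by the close() in the other thread. */
    rc = poll(&pfd, 1, 10000);
    if (rc == -1)
        printf("poll error: %s\n", strerror(errno));
    else if (rc == 0)
        printf("timed out: the close was never noticed\n");
    else
        printf("woke up: revents=%#x\n", (unsigned)pfd.revents);

    pthread_join(t, NULL);
    return 0;
}

If the close happens to land before poll() is entered, POSIX already
defines that result (POLLNVAL in revents), so the program can tell
such a run apart from the case we actually want to test.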


| As for 'epoll_wait', it's perfectly legal. If that's the last
| reference to the file, it will no longer be waited on. One of the
| advantages of 'epoll' is that you don't need the synchronization you
| need with 'poll' and 'select'.

I agree that epoll_wait is a better way to do things. But I do not see
it as being the exclusive way to deal with this kind of error event.
--
|---------------------------------------/----------------------------------|
| Phil Howard KA9WGN (ka9wgn.ham.org) / Do not send to the address below |
| first name lower case at ipal.net / spamtrap-2008-02-04-***@ipal.net |
|------------------------------------/-------------------------------------|
David Schwartz
2008-02-05 03:50:28 UTC
Permalink
Post by p***@ipal.net
The scenario is possible to create.
It is not.
Post by p***@ipal.net
A test scenario does not have to
be perfect by being sure the system timing can't allow the thread that
does the closing of the descriptor to run until after the select/poll.
It has to *know* that it has succeeded. If it does not know that it
has gotten into a particular state, then the standards do not require
that it see the behavior for that state even if it "just happens to
be" in that state.

The standards never require one behavior if a conforming application
cannot tell the difference.
Post by p***@ipal.net
You can
have the select/poll wait for a very long time in a test case that never
gets data. You can have the closer thread also wait a very long time to
minimize the statistical chance of the closer doing the close before the
thread that does the select/poll has actually entered the kernel to
wait. In the very extreme cases where it takes the waiter thread
minutes to get into select/poll due to weird system performance, you can
simply dismiss that test run as "not what we wanted to test". It will
result in an error in select/poll as if waiting on a descriptor that is
not open. So the program can even detect if it failed to carry out the
intended test.
You're missing the point. The point is not that you may or may not get
the behavior. The point is, if your code is not *guaranteed* to get
the behavior, nothing requires that it see what happens with that
behavior, even if it coincidentally does get into the state you want
to test.

The standards only say what compliant programs must see when they can
tell the difference. If the program cannot tell the difference between
two states, then anything permitted by the standard for one state is
also permitted for the other.
Post by p***@ipal.net
So we can have a test program that will easily be able to carry
out the test in virtually all instances of being run, and readily detect
when it failed to do so (and possibly even just try again up to some
number of times).
The problem is, since the test program will never know for sure that
it is testing this case, the platform is never required to show the
test program what it will do in that case. It is only if the program
*knows* it is in state A that the platform is required to give it the
behavior for state A.

For example, suppose we allocate more memory than the system has. We
(humans) know the platform is out of memory but the program cannot
know that it is out of memory. So even if POSIX says something has to
happen in this case, the 'as-if' rule allows the platform to not do
that. The program has no way to know (and hence no *right* to know)
that it is out of memory.
Post by p***@ipal.net
So we _can_ test what the system _would_ do in this case. So that is no
impediment to defining what it _should_ or _must_ do when it happens.
Right, but we cannot tell what it should or must do, because it only
has to do that for a program that *knows* it is in that situation.
Post by p***@ipal.net
| As for 'epoll_wait', it's perfectly legal. If that's the last
reference to the file, it will no longer be waited on. One of the
| advantages of 'epoll' is that you don't need the synchronization you
| need with 'poll' and 'select'.
I agree that epoll_wait is a better way to do things. But I do not see
it as being the exclusive way to deal with this kind of error event.
This is undefined behavior. It has caused security problems in the
past.

DS
p***@ipal.net
2008-02-05 04:48:05 UTC
Permalink
On Mon, 4 Feb 2008 19:50:28 -0800 (PST) David Schwartz <***@webmaster.com> wrote:
| On Feb 4, 7:29 pm, phil-news-***@ipal.net wrote:
|
|> The scenario is possible to create.
|
| It is not.
|
|> A test scenario does not have to
|> be perfect by being sure the system timing can't allow the thread that
|> does the closing of the descriptor until after the select/poll.
|
| It has to *know* that it has succeeded. If it does not know that it
| has gotten into a particular state, then the standards do not require
| that it see the behavior for that state even if it "just happens to
| be" in that state.

If the kernel conforms to a standard I have in mind, it would be able
to know it succeeded or failed, whichever the case might be. It would
get EBADF if the descriptor is already closed by the time poll() enters
the kernel code where the kernel atomically checks the list. If it does
get into the poll() waiting state, the close() called by another thread
can set the error flag and make poll() wake up, giving the caller POLLERR
for that descriptor. Depending on which error comes back from poll(),
the calling thread knows. The thread doing the close() can go away and
does not need to ever know.
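
In code, the waiter under those proposed semantics would look like
this (hypothetical: it encodes my proposal, not anything current POSIX
or Linux promises):

#include <errno.h>
#include <poll.h>

/* HYPOTHETICAL: relies on the proposed semantics described above.
   Returns 1 if a close() by another thread was observed, 0 on
   ordinary readiness, -1 on any other error. */
int wait_observing_close(int fd)
{
    struct pollfd pfd;
    int rc;

    pfd.fd = fd;
    pfd.events = POLLIN;

    rc = poll(&pfd, 1, -1);
    if (rc == -1 && errno == EBADF)
        return 1;   /* proposed: already closed when poll() entered */
    if (rc > 0 && (pfd.revents & POLLERR))
        return 1;   /* proposed: closed while we were blocked */
    return rc > 0 ? 0 : -1;
}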


| The standards never require one behavior if a conforming application
| cannot tell the difference.

Tell the difference between what? That one behaviour happened rather than another?


|> You can
|> have the select/poll wait for a very long time in a test case that never
|> gets data. You can have the closer thread also wait a very long time to
|> minimize the statistical chance of the closer doing the close before the
|> thread that does the select/poll has actually entered the kernel to
|> wait. In the very extreme cases where it takes the waiter thread
|> minutes to get into select/poll due to weird system performance, you can
|> simply dismiss that test run as "not what we wanted to test". It will
|> result in an error in select/poll as if waiting on a descriptor that is
|> not open. So the program can even detect if it failed to carry out the
|> intended test.
|
| You're missing the point. The point is not that you may or may not get
| the behavior. The point is, if your code is not *guaranteed* to get
| the behavior, nothing requires that it see what happens with that
| behavior, even if it coincidentally does get into the state you want
| to test.
|
| The standards only say what compliant programs must see when they can
| tell the difference. If the program cannot tell the difference between
| two states, then anything permitted by the standard for one state is
| also permitted for the other.

I'm sure you could come up with some standard definition under which a
program would not be able to see the behaviour. For example we could
require that when a descriptor being waited on gets closed, the computer
must electrocute all dogs within a 10 meter radius. I doubt the program
would be able to detect that. Obviously this is an absurd example and
is intended as such. But I think _my_ idea of what such a standard
should be like would be easily detectable, enough to fully justify
that idea being a viable candidate for such a standard (other ideas
might work just as well).

Maybe we should pick such a hypothetical standard as a basis to talk
about this in a less vaguely abstract way.


|> So we can have a test program that will easily be able to carry
|> out the test in virtually all instances of being run, and readily detect
|> when it failed to do so (and possibly even just try again up to some
|> number of times).
|
| The problem is, since the test program will never know for sure that
| it is testing this case, the platform is never required to show the
| test program what it will do in that case. It is only if the program
| *knows* it is in state A that the platform is required to give it the
| behavior for state A.

I don't agree that it will never know for sure. I believe it will know
for sure that it succeeded in testing it in the vast majority of cases
and know that it failed in the extremely few cases it could fail.


| For example, suppose we allocate more memory than the system has. We
| (humans) know the platform is out of memory but the program cannot
| know that it is out of memory. So even if POSIX says something has to
| happen in this case, the 'as-if' rule allows the platform to not do
| that. The program has no way to know (and hence no *right* to know)
| that it is out of memory.

The malloc() call will return NULL. How is that so hard to define?

Of course what is important here is not if the system has or does not
have enough memory for _something_ but rather, if it has enough memory
for the process asking for some. It is a standard defining how an
interface behaves. The definition of malloc() is not about if there
is any memory anywhere, but rather, if some memory can be made available
to the calling process. If not, it gets NULL.
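
In code, the defined meaning amounts to nothing more than this (plain
standard C; the helper is just an illustration):

#include <stdlib.h>
#include <string.h>

/* The whole contract at the interface: a non-NULL return means the
   process may use the memory; NULL means it may not. Nothing here
   says anything about how much memory the system "really" has. */
char *dup_string(const char *s)
{
    size_t len = strlen(s) + 1;
    char *p = malloc(len);

    if (p == NULL)
        return NULL;   /* malloc said no; that is all we can know */
    memcpy(p, s, len);
    return p;
}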



|> So we _can_ test what the system _would_ do in this case. So that is no
|> impediment to defining what it _should_ or _must_ do when it happens.
|
| Right, but we cannot tell what it should or must do, because it only
| has to do that for a program that *knows* it is in that situation.

It's a perfectly definable situation.

|
|> | As for 'epoll_wait', it's perfectly legal. If that's the last
|> | reference to the file, it will no longer be waited on. One of the
|> | advantages of 'epoll' is that you don't need the synchronization you
|> | need with 'poll' and 'select'.
|>
|> I agree that epoll_wait is a better way to do things. But I do not see
|> it as being the exclusive way to deal with this kind of error event.
|
| This is undefined behavior. It has caused security problems in the
| past.

Then maybe it needs to become a defined behavior.
--
|---------------------------------------/----------------------------------|
| Phil Howard KA9WGN (ka9wgn.ham.org) / Do not send to the address below |
| first name lower case at ipal.net / spamtrap-2008-02-04-***@ipal.net |
|------------------------------------/-------------------------------------|
David Schwartz
2008-02-05 21:23:45 UTC
Permalink
Post by p***@ipal.net
| For example, suppose we allocate more memory than the system has. We
| (humans) know the platform is out of memory but the program cannot
| know that it is out of memory. So even if POSIX says something has to
| happen in this case, the 'as-if' rule allows the platform to not do
| that. The program has no way to know (and hence no *right* to know)
| that it is out of memory.
The malloc() call will return NULL. How is that so hard to define?
Will it? Can you write a compliant program that can detect if it
doesn't?
Post by p***@ipal.net
| Right, but we cannot tell what it should or must do, because it only
| has to do that for a program that *knows* it is in that situation.
It's a perfectly definable situation.
Then please, write a compliant program that creates that situation. Go
ahead. Do it. You can't.

It's not a perfectly definable situation because it is impossible for
a compliant program to know it is in that situation.
Post by p***@ipal.net
| This is undefined behavior. It has caused security problems in the
| past.
Then maybe it needs to become a defined behavior.
As I've tried to explain, that's literally impossible.

Consider:

1) Thread A is about to call 'poll' on descriptor 5, a socket.
2) Thread B 'close's descriptor 5.
3) Thread C, an internal thread created by the platform's pthreads
implementation, runs on a timer and needs to open a descriptor to read
a library file. It gets descriptor 5 by pure coincidence.
4) Thread A enters 'poll' and gets a hit on descriptor 5 for read,
since a file is always ready for reading.
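
In code, thread A's side looks like this (a sketch; the descriptor
number is illustrative, and nothing thread A can do closes the gap
marked below):

#include <poll.h>

void thread_a(int fd)
{
    struct pollfd pfd;

    pfd.fd = fd;        /* say fd == 5 */
    pfd.events = POLLIN;

    /* <-- window: thread B may close(fd) here, and thread C's
           open() may then be handed the same number, so the
           poll() below watches an unrelated file. */
    poll(&pfd, 1, -1);
}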

The standard *cannot* specify (is literally incapable of specifying) what
must happen in a situation that cannot be reliably created. Really.

DS
p***@ipal.net
2008-02-06 04:47:06 UTC
Permalink
On Tue, 5 Feb 2008 13:23:45 -0800 (PST) David Schwartz <***@webmaster.com> wrote:
| On Feb 4, 8:48 pm, phil-news-***@ipal.net wrote:
|
|> | For example, suppose we allocate more memory than the system has. We
|> | (humans) know the platform is out of memory but the program cannot
|> | know that it is out of memory. So even if POSIX says something has to
|> | happen in this case, the 'as-if' rule allows the platform to not do
|> | that. The program has no way to know (and hence no *right* to know)
|> | that it is out of memory.
|
|> The malloc() call will return NULL. How is that so hard to define?
|
| Will it? Can you write a compliant program that can detect if it
| doesn't?

If what you are asking is if a program can detect that malloc() returned
NULL when it should not have, then maybe we need to just dismiss all of
POSIX and start over.

There are SO MANY things which a program CANNOT test for beyond what it
has been given through the interface being tested. If malloc() returns
NULL when memory really was available, sure we can say that malloc() in
that case is a failure. The fact that we cannot easily (or at all) test
whether this failure is happening does not mean we cannot standardize what
we expect of malloc(). The value of standards is not so much about some
kind of perfection of logic and more about all of us operating with the
same idea of how it works.


|> | Right, but we cannot tell what it should or must do, because it only
|> | has to do that for a program that *knows* it is in that situation.
|
|> It's a perfectly definable situation.
|
| Then please, write a compliant program that creates that situation. Go
| ahead. Do it. You can't.

I've described what it will do. Try explaining how a program that works
as described would fail. Elsewhere you suggested a program cannot close
a descriptor being waited on. So what happens when it tries and the
close() syscall is performed? Will there be an error from close()? Is
there a standard errno code for it? Is this in POSIX?


| It's not a perfectly definable situation because it is impossible for
| a compliant program to know it is in that situation.

I still think it is. But I think it is clear enough from my description
that I feel there is no need to write code. However, I will write that
code if there is an agreement from the POSIX standards committee that if
my program works, my proposal becomes part of the standard. But if there
is no such process in the works, why should I go beyond describing a
clear scenario that most programmers with an understanding of threads
and Unix can understand?


|> | This is undefined behavior. It has caused security problems in the
|> | past.
|
|> Then maybe it needs to become a defined behavior.
|
| As I've tried to explain, that's literally impossible.
|
| Consider:
|
| 1) Thread A is about to call 'poll' on descriptor 5, a socket.
| 2) Thread B 'close's descriptor 5.
| 3) Thread C, an internal thread created by the platform's pthreads
| implementation, runs on a timer and needs to open a descriptor to read
| a library file. It gets descriptor 5 by pure coincidence.
| 4) Thread A enters 'poll' and gets a hit on descriptor 5 for read,
| since a file is always ready for reading.

That's not the scenario I made my proposal for.


| The standard *cannot* specify (is literally incapable of specifying) what
| must happen in a situation that cannot be reliably created. Really.

Sure it can. It's a matter of willingness.
--
|---------------------------------------/----------------------------------|
| Phil Howard KA9WGN (ka9wgn.ham.org) / Do not send to the address below |
| first name lower case at ipal.net / spamtrap-2008-02-05-***@ipal.net |
|------------------------------------/-------------------------------------|
David Schwartz
2008-02-06 06:31:07 UTC
Permalink
Post by p***@ipal.net
If what you are asking is if a program can detect that malloc() returned
NULL when it should not have, then maybe we need to just dismiss all of
POSIX and start over.
No, the other way around. A program that can detect that malloc did
not return NULL when it should have. Such a program is impossible,
since there is no way the program can know that the implementation is
out of memory other than malloc returning NULL.

For example, suppose an implementation has an API that allows a
program to get memory but to promise to return it if the OS needs it
later. In this case, the platform may decide to not return NULL from
malloc even though it's out of memory because it knows it can suspend
the process and force another process to give up the memory later.

POSIX *cannot* make this illegal. It is literally impossible for it to
do so, because no circumstance would ever violate it. Since the only
way the program can tell the system is out of memory is by malloc
returning NULL, no compliant program could ever detect that the
platform had returned non-NULL from malloc even though it didn't have
memory.

The 'as-if' rule permits undetectable violations.

Any rule whose violation can never be detected can never be violated
-- undetectable violations are always permitted.
Post by p***@ipal.net
|> | Right, but we cannot tell what it should or must do, because it only
|> | has to do that for a program that *knows* it is in that situation.
|
|> It's a perfectly definable situation.
|
| Then please, write a compliant program that creates that situation. Go
| ahead. Do it. You can't.
I've described what it will do. Try explaining how a program that works
as described would fail.
There is no way you can know whether the 'close' occurs before or
after the 'poll'. So any rule that requires the implementation to
treat those two cases differently would never apply.
Post by p***@ipal.net
Elsewhere you suggested a program cannot close
a descriptor being waited on. So what happens when it tries and the
close() syscall is performed? Will there be an error from close()? Is
there a standard errno code for it? Is this in POSIX?
It's undefined behavior since there are numerous semantically-distinct
situations that could occur. Real-world platforms can even write data
to the wrong descriptor (and cause security problems) if an
application does this. The only solution in the POSIX world is for the
application not to do this.

It is semantically identical to accessing some memory in one thread
while another thread calls 'free' on it.
Post by p***@ipal.net
| It's not a perfectly definable situation because it is impossible for
| a compliant program to know it is in that situation.
I still think it is. But I think it is clear enough from my description
that I feel there is no need to write code. However, I will write that
code if there is an agreement from the POSIX standards committee that if
my program works, my proposal becomes part of the standard. But if there
is no such process in the works, why should I go beyond describing a
clear scenario that most programmers with an understanding of threads
and Unix can understand?
If a program cannot know that it is in case A, it cannot ever know
that it is entitled to defined behavior for case A. So it must always
accept the behavior for other cases. It's really that simple. If you
cannot prove you are not in case A, you cannot say that the behavior
for case A is incorrect.
Post by p***@ipal.net
|> | This is undefined behavior. It has caused security problems in the
|> | past.
|
|> Then maybe it needs to become a defined behavior.
|
| As I've tried to explain, that's literally impossible.
|
|
| 1) Thread A is about to call 'poll' on descriptor 5, a socket.
| 2) Thread B 'close's descriptor 5.
| 3) Thread C, an internal thread created by the platform's pthreads
| implementation, runs on a timer and needs to open a descriptor to read
| a library file. It gets descriptor 5 by pure coincidence.
| 4) Thread A enters 'poll' and gets a hit on descriptor 5 for read,
| since a file is always ready for reading.
That's not the scenario I made my proposal for.
You still don't get it. It doesn't matter what scenario you happen to
be in, as far as the standards are concerned. It only matters what you
can prove you are in. The standard is not about what happens to occur,
independent of the code. It's about what the code itself does and what
behavior it must get.

You cannot rule out the scenario above, it can always happen no matter
what you do. So anything that is legal in that scenario above is legal
in the scenarios you imagine.

It's like asking that malloc not return NULL because you somehow
externally know that you are not out of memory. POSIX does not permit
you to "somehow know" something. And, in practice, may real-world
implementations actually return NULL from 'malloc' even though they
really do have enough memory to satisfy the request.

The standard *cannot* prohibit this. The application must accept that
'malloc' can return NULL anytime the application cannot prove that
enough memory is available.
Post by p***@ipal.net
| The standard *cannot* specify (is literally incapable of specifying) what
| must happen in a situation that cannot be reliably created. Really.
Sure it can. It's a matter of willingness.
It's impossible. It would be as impossible as the standard prohibiting
malloc from returning NULL if the platform is not out of memory. The
standard cannot specify what "out of memory" means in any way other
than within the API itself -- which is only if 'malloc' returns NULL.

The standard could only work this out by providing a way to prove you
are in the case you want to specify. For example, if POSIX added a
"getavailablememory" function, then you could know that the platform
has the memory, and you could require that 'malloc' not claim
otherwise.

Similarly, POSIX could add an atomic "release mutex and poll"
function. In that case, you could write a program that could know for
sure that it called "close" in one thread while another thread was
blocked in poll. But without that, the standard could only specify the
behavior with an exception to the "as-if" rule.

That would be extremely bizarre.
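
For concreteness, such a function might look like this (hypothetical;
nothing like it exists in POSIX):

#include <poll.h>
#include <pthread.h>

/* HYPOTHETICAL, by analogy with pthread_cond_wait(): atomically
   release 'mtx' and block in a poll on 'fds'. A closer thread that
   acquires 'mtx' first would then know the waiter is either not yet
   in the poll (and still holds the mutex) or fully blocked inside
   it, never in between. */
int pthread_mutex_unlock_and_poll(pthread_mutex_t *mtx,
                                  struct pollfd fds[], nfds_t nfds,
                                  int timeout);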

DS
p***@ipal.net
2008-02-06 14:34:59 UTC
Permalink
On Tue, 5 Feb 2008 22:31:07 -0800 (PST) David Schwartz <***@webmaster.com> wrote:
|
| phil-news-***@ipal.net wrote:
|
|
|> If what you are asking is if a program can detect that malloc() returned
|> NULL when it should not have, then maybe we need to just dismiss all of
|> POSIX and start over.
|
| No, the other way around. A program that can detect that malloc did
| not return NULL when it should have. Such a program is impossible,
| since there is no way the program can know that the implementation is
| out of memory other than malloc returning NULL.
|
| For example, suppose an implementation has an API that allows a
| program to get memory but to promise to return it if the OS needs it
| later. In this case, the platform may decide to not return NULL from
| malloc even though it's out of memory because it knows it can suspend
| the process and force another process to give up the memory later.
|
| POSIX *cannot* make this illegal. It is literally impossible for it to
| do so, because no circumstance would ever violate it. Since the only
| way the program can tell the system is out of memory is by malloc
| returning NULL, no compliant program could ever detect that the
| platform had returned non-NULL from malloc even though it didn't have
| memory.

The standard of an INTERFACE is about MEANING. If malloc returns NULL
then the implementation of malloc MEANT that it is saying there is no
memory (even if there is). If malloc returns an address, then it is
saying there *IS* memory, even if there is not.

Sure, we cannot detect if the implementation is giving us FALSE returns
and is lying to us. But that doesn't mean we can't define a MEANING
for these values.


| The 'as-if' rule permits undetectable violations.
|
| Any rule whose violation can never be detected can never be violated
| -- undetectable violations are always permitted.

So.


|> |> | Right, but we cannot tell what it should or must do, because it only
|> |> | has to do that for a program that *knows* it is in that situation.
|> |
|> |> It's a perfectly definable situation.
|> |
|> | Then please, write a compliant program that creates that situation. Go
|> | ahead. Do it. You can't.
|
|> I've described what it will do. Try explaining how a program that works
|> as described would fail.
|
| There is no way you can know whether the 'close' occurs before or
| after the 'poll'. So any rule that requires the implementation to
| treat those two cases differently would never apply.

In the test case there would be a way to know. It would get EBADF, and
the test program would create no other descriptor that could be handed
the same number.


|> Elsewhere you suggested a program cannot close
|> a descriptor being waited on. So what happens when it tries and the
|> close() syscall is performed? Will the be an error from close()? Is
|> there a standard errno code for it? Is this in POSIX?
|
| It's undefined behavior since there are numerous semantically-distinct
| situations that could occur. Real-world platforms can even write data
| to the wrong descriptor (and cause security problems) if an
| application does this. The only solution in the POSIX world is for the
| application not to do this.
|
| It is semantically identical to accessing some memory in one thread
| while another thread calls 'free' on it.

So that means we are not allowed to have threads? I think not.

This thread is getting way too long and TOTALLY POINTLESS. Either you have
failed to explain your case well enough to overcome the obvious opposite,
or it's just wrong. I'm now out of here. I'm done with this thread. I
do not have the time to keep up these circles.
--
|---------------------------------------/----------------------------------|
| Phil Howard KA9WGN (ka9wgn.ham.org) / Do not send to the address below |
| first name lower case at ipal.net / spamtrap-2008-02-06-***@ipal.net |
|------------------------------------/-------------------------------------|
Dildo Bogumil di Boscopelo
2008-02-06 14:42:42 UTC
Permalink
Post by p***@ipal.net
This thread is getting way too long and TOTALLY POINTLESS. Either you have
failed to explain your case well enough to overcome the obvious opposite,
or it's just wrong. I'm now out of here. I'm done with this thread. I
do not have the time to keep up these circles.
i'm still wondering how you managed to go this far :)
--
SF

We do what we have been trained for, what we have been raised for, what
we were born for. No prisoners, no mercy. A memorable beginning.
shebble
2008-02-06 19:42:06 UTC
Permalink
Post by Dildo Bogumil di Boscopelo
Post by p***@ipal.net
This thread is getting way too long and TOTALLY POINTLESS. Either you have
failed to explain your case well enough to overcome the obvious opposite,
or it's just wrong. I'm now out of here. I'm done with this thread. I
do not have the time to keep up these circles.
i'm still wondering how you managed to go this far :)
Ok, so defense of the standard and pure logic aside.. I just think it would be
nice if a man page somewhere said 'don't close() an fd when it is being watched
by select() or poll()'. <shrug>

If a programmer has to go over a standard and analyze EVERYTHING that ISN'T
spelled out.. well that isn't going to happen with 100% accuracy. But neither can
we write, or have we ever written, a standard that spells out everything. We
would still be working on it now. Ahh, now I'm going in circles too.
David Schwartz
2008-02-06 21:04:25 UTC
Permalink
Post by shebble
Ok, so defense of the standard and pure logic aside.. I just think it would be
nice if a man page somewhere said 'don't close() an fd when it is being watched
by select() or poll()'. <shrug>
There is really no need to warn people not to do the impossible. It is
impossible to write a program that is guaranteed to close an fd while
it is being watched by select or poll. There will always be a window
where the behavior is undefined.
Post by shebble
If a programmer has to go over a standard and analyze EVERYTHING that ISN'T
spelled out.. well that isn't going to happen with 100% accuracy. But neither can
we write, or have we ever written, a standard that spells out everything. We
would still be working on it now. Ahh, now I'm going in circles too.
That is why the most important thing to understand about standards is
the 'as-if' clause, what it means, and how it works.

For example, if the only way you can tell how much memory a platform
has is by 'malloc' returning NULL, then the standard can never require
'malloc' to return NULL if the platform is out of memory. Why? Because
the definition of "out of memory" (as far as the program is concerned)
is "malloc returns NULL". So you would be asking the standard to
guarantee the impossible doesn't happen (malloc fails to return NULL
when malloc returns NULL).

The standards only define the subset of system behavior that is
visible to a compliant application. If an application cannot tell if
it is in case A or case B with certainty, then the behavior for both
case A and for case B must be permitted. If either case invokes
undefined behavior, then both do.

On the flip side, it might be handy to have a "release mutex and poll"
function. That would make this defined behavior and permit the
standard to specify what happens in this case.

DS
p***@ipal.net
2008-02-07 03:06:34 UTC
Permalink
On Wed, 06 Feb 2008 12:42:06 -0700 shebble <***@example.com> wrote:
| Dildo Bogumil di Boscopelo wrote:
|> phil-news-***@ipal.net wrote:
|>
|>> This thread is getting way too long and TOTALLY POINTLESS. Either you
|>> have
|>> failed to explain your case well enough to overcome the obvious opposite,
|>> or it's just wrong. I'm now out of here. I'm done with this thread. I
|>> do not have the time to keep up these circles.
|>>
|>
|> i'm still wondering how you managed to go this far :)
|>
|
| Ok, so defense of the standard and pure logic aside.. I just think it would be
| nice if a man page somewhere said 'don't close() an fd when it is being watched
| by select() or poll()'. <shrug>

As it stands now, and this is unlikely to change, the standard does not
say what will happen. That leaves anything open, such as causing a power
outage (very odd, but compliant).

I suspect if I proposed that POSIX specify that a compliant implementation
MUST NOT cause a power failure, we'd end up with yet another long thread
of a debate.


| If a programmer has to go over a standard and analyze EVERYTHING that ISN'T
| spelled out.. well that isn't going to happen with 100% accuracy. But neither can
| we write, or have we ever written, a standard that spells out everything. We
| would still be working on it now. Ahh, now I'm going in circles too.

We can spell out what things MEAN. Standards for communications protocols
seem to do a better job of this than standards for operating system calls.
Consider that "standard" of the English language and the phrase "I did not
have sexual relations with that woman". I see that sentence as compliant
with the language, even though I didn't believe the famous speaker of those
words.

Take a look at TCP. There is a race condition in which both ends of a
connection cannot be 100% sure that the other end knows the connection
is ended when it gets shut down. But did that stop the development of
TCP? Of course not.
--
|---------------------------------------/----------------------------------|
| Phil Howard KA9WGN (ka9wgn.ham.org) / Do not send to the address below |
| first name lower case at ipal.net / spamtrap-2008-02-06-***@ipal.net |
|------------------------------------/-------------------------------------|
Rainer Weikusat
2008-02-05 10:56:40 UTC
Permalink
Post by shebble
On Linux, lets say 2.6 kernel...
Can you close a file descriptor in one thread while another thread is
using it in a select(), poll(), or epoll_wait() ?
What will happen?
You won't get events reported for this descriptor anymore because the
associated file table entry has been cleared. You may get EBADF
instead of a non-error return. If another thread executes a system
call creating a new file descriptor in between, you may get events
reported for other 'I/O entities' than those the multiplexing call was
likely intended to work on.
shebble
2008-02-05 19:29:32 UTC
Permalink
Post by Rainer Weikusat
Post by shebble
On Linux, lets say 2.6 kernel...
Can you close a file descriptor in one thread while another thread is
using it in a select(), poll(), or epoll_wait() ?
What will happen?
You won't get events reported for this descriptor anymore because the
associated file table entry has been cleared. You may get EBADF
instead of a non-error return. If another thread executes a system
call creating a new file descriptor in between, you may get events
reported for other 'I/O entities' than those the multiplexing call was
likely intended to work on.
Ok, I won't do it. :)

This all started because I'm slowly writing a GUI serial port utility to better
learn Linux internals. Earlier I wanted to know how I should have a worker
thread wait on a file descriptor, and also be able to wake up the thread and
give it commands. It was suggested here that I use a pipe and wait on that also.
Then use the pipe to send the worker thread commands.

It works well. But I wondered.. I really just want my worker thread to die when
the file descriptor is closed.. what if I just close() it. Well now I know. So
I'm back on the PIPE! er..
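
In case anyone finds this thread later, the worker loop I ended up
with is shaped roughly like this (a sketch; serial_fd and the
one-byte command protocol are just what my program happens to use):

#include <poll.h>
#include <unistd.h>

/* Wait on the serial port and on the read end of a pipe. Other
   threads write one-byte commands to the pipe's write end; 'q'
   asks the worker to exit. */
void worker(int serial_fd, int cmd_read_fd)
{
    struct pollfd pfd[2];
    char buf[256];
    char cmd;

    pfd[0].fd = serial_fd;   pfd[0].events = POLLIN;
    pfd[1].fd = cmd_read_fd; pfd[1].events = POLLIN;

    for (;;) {
        if (poll(pfd, 2, -1) == -1)
            break;
        if (pfd[0].revents & POLLIN) {
            ssize_t n = read(serial_fd, buf, sizeof buf);
            if (n <= 0)
                break;               /* port error or EOF */
            /* ... handle the serial data ... */
        }
        if (pfd[1].revents & POLLIN) {
            if (read(cmd_read_fd, &cmd, 1) == 1 && cmd == 'q')
                break;               /* told to quit */
        }
    }
}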