Show h5 files as complete series on Windows #1650

Open
barentine wants to merge 1 commit into python-microscopy:master from barentine:clusterlistwindows

Conversation

@barentine
Member

Addresses an issue where POSIX systems show h5 files as complete series, while Windows lists them as series but marks them incomplete.

Is this a bugfix or an enhancement?
bugfix
Proposed changes:

  • Mark h5 series in clusterListing as finished spooling on Windows, to match POSIX behavior.
image

to

image

I am having issues localizing an h5 file on Windows. This doesn't solve that, but seems worth fixing up.

@David-Baddeley
Contributor

There's a bit of an issue here in that we don't have a way of knowing a priori whether the file is complete or not (we'd probably need to somehow check if another process has an open handle on it). This might be an issue on both Windows and *nix. The easiest (if very inelegant) way of solving this might be to write a sentinel file once spooling is complete (or chmod the file to read-only).

@David-Baddeley
Contributor

I.e., this change will make the file appear complete even while it is still spooling.

@David-Baddeley
Contributor

An alternative option would be to check the modification time and call the file complete if it was last modified more than 1 hr ago or something (but this adds unwanted overhead).
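
A minimal sketch of the modification-time heuristic, assuming a hypothetical helper `looks_complete` and an illustrative one-hour threshold; neither is part of PYME. It costs one `os.stat` call per file, which is the overhead in question.

```python
import os
import time


def looks_complete(path, min_age_s=3600.0, now=None):
    """Guess that a series is complete if it has not been modified recently.

    Hypothetical helper, not part of PYME. `min_age_s` is an illustrative
    threshold: a file left untouched for longer than this is treated as
    finished spooling.
    """
    if now is None:
        now = time.time()
    # st_mtime is the last modification time, in seconds since the epoch
    return (now - os.stat(path).st_mtime) > min_age_s
```

A file that is still being written keeps refreshing its modification time, so it reads as incomplete; the trade-off is that a freshly finished series is also reported as incomplete until the threshold has passed.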

@barentine
Member Author

I'll check at what point we write the events table when spooling to h5. It would be nice if that were consistent with PZF spooling, so that we could use its presence as a flag for completion.

@David-Baddeley
Contributor

We used to write the events as we went (incrementally) - I'm not sure if I changed that with the PZF in H5 changes.

Are we already opening the file to get the frame number? Obviously opening the file is going to have some performance overhead.

@David-Baddeley
Contributor

If we are opening the file, I think the hdf spool saves an EndTime to the metadata which could be used as a completion flag.

@David-Baddeley
Contributor

No, I see it now, num_frames is the filesize in bytes for .h5.

@barentine
Member Author

> If we are opening the file, I think the hdf spool saves an EndTime to the metadata which could be used as a completion flag.

One of the things I'm hoping to fix is running on data that was not acquired with PYMEAcquire, i.e. if I converted a tif stack to h5 and want to run localization on it. It likely won't have the EndTime.

@David-Baddeley
Contributor

Fair point.

I think we need some kind of flag which is cheap to check. Options I can think of:

  • Change permissions of file to read-only once spooling is complete (nice in that it enforces our write-once assumption, fairly quick to check, potentially a bit of a pain if you want to delete the file).
  • Drop an accompanying file when complete (or when saving during conversion). Not backwards compatible, and adds clutter.
  • Drop an _inprogress file while we are spooling and delete afterwards. (Not terribly robust - doesn't get deleted if things go wrong).

I don't really like dropping additional files as flags, however.
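
A rough sketch of the first option only (the read-only flag), assuming hypothetical helper names that are not part of PYME. It inspects the mode bits from `os.stat` instead of calling `os.access`, so the answer does not depend on which user the server runs as. On Windows, `os.chmod` can only toggle the read-only attribute, which is all this needs.

```python
import os
import stat

# All three write bits; on Windows only the owner bit is meaningful
_WRITE_BITS = stat.S_IWUSR | stat.S_IWGRP | stat.S_IWOTH


def mark_complete(path):
    """Clear the write bits once spooling has finished (hypothetical helper)."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    os.chmod(path, mode & ~_WRITE_BITS)


def is_marked_complete(path):
    """Treat a file with no write bits set as a finished series."""
    return not (os.stat(path).st_mode & _WRITE_BITS)


def make_deletable(path):
    """Restore the owner's write bit, e.g. before deleting the file on Windows."""
    os.chmod(path, stat.S_IMODE(os.stat(path).st_mode) | stat.S_IWUSR)
```

The check is a single `stat` call and needs no extra files, but a converted file would also have to be marked read-only for the listing to show it as complete.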

@barentine
Member Author

I think I've got an idea for a non-filedrop option:
If we're spooling to an h5 file, then a dataserver has to have that file open. Since clusterListing._file_info gets called on the server side, we can just import h5rfile and check whether the file is in the cache and whether it is still open. The 120 s keep-alive is a little long, but since the file is already open, checking for EndTime adds no I/O.
Haven't tried it yet, but does that seem preferable?

@David-Baddeley
Contributor

I'm a little worried that the round-trip is still going to be a bit high - directory listing is supposed to be fast - but it could work.

@David-Baddeley
Contributor

I'm wondering if the easiest option isn't just to check the modification time.
