Ticket #70 (reopened defect)

Opened 7 months ago

Last modified 6 months ago

too many open files

Reported by: r1chardj0n3s
Owned by: matt
Priority: critical
Milestone:
Component: General
Version:
Keywords:
Cc: richardjones@…, stephane.demurget@…

Description

I'm indexing some large files by opening the index and creating a new writer for each file I index and then committing after each file is indexed.

The output of lsof from just before it crashes is attached. There appear to be quite a few files opened multiple times.
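
For anyone reading the attachment, a quick way to spot files that show up more than once in lsof output is to tally the NAME column. This is only a minimal sketch: it assumes lsof's default nine-column layout, only looks at regular files (TYPE "REG"), and the function name and sample rows are made up for illustration (they are not taken from lsof-out.txt).

```python
from collections import Counter


def files_opened_multiple_times(lsof_text):
    """Return {path: count} for regular files listed more than once."""
    counts = Counter()
    for line in lsof_text.splitlines():
        # Assumed default layout:
        # COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
        parts = line.split(None, 8)
        # Skip the header, short rows, and anything that is not a regular file.
        if len(parts) < 9 or parts[4] != "REG":
            continue
        counts[parts[8]] += 1
    return {name: n for name, n in counts.items() if n > 1}


# Made-up sample rows, for illustration only.
SAMPLE = """\
COMMAND  PID USER   FD   TYPE DEVICE SIZE/OFF NODE NAME
python  4242 user    3r   REG    8,1     1024  101 /tmp/index/segment_a.dat
python  4242 user    4r   REG    8,1     1024  101 /tmp/index/segment_a.dat
python  4242 user    5r   REG    8,1     2048  102 /tmp/index/segment_b.dat
"""

duplicates = files_opened_multiple_times(SAMPLE)
```

Rows with blank columns (some sockets and pipes) would not split cleanly this way, which is why the sketch restricts itself to regular files.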

Attachments

lsof-out.txt (19.0 KB) - added by r1chardj0n3s 7 months ago.
Output of lsof
lumberjack.py (11.7 KB) - added by r1chardj0n3s 7 months ago.
processing program

Change History

Changed 7 months ago by r1chardj0n3s

Output of lsof

Changed 7 months ago by r1chardj0n3s

  • cc richardjones@… added
  • priority changed from major to critical

BTW I attempted the fix from bug #45 and it didn't help.

I've modified my code to commit after every 10,000 lines of input processed and that didn't help.

I can provide my program if that'll help - I just can't provide the input data.

Changed 7 months ago by matt

  • status changed from new to assigned

Yes please, if you could attach or send the program you're using to index, that would be very helpful.

Changed 7 months ago by r1chardj0n3s

  • status changed from assigned to closed
  • resolution set to worksforme

Ah, I think I got it: I was creating a writer for each file being processed. I seem to recall that I'm doing this because previously I couldn't commit() multiple times for a given writer - perhaps this is something that's changed recently.

Changed 7 months ago by r1chardj0n3s

processing program

Changed 7 months ago by r1chardj0n3s

  • status changed from closed to reopened
  • resolution worksforme deleted

Darn, I spoke too soon. It just crashed again after processing quite a few (64) files.

Changed 7 months ago by r1chardj0n3s

Since the program is written to be restartable I have done so and processing continues - thus leading me to believe it's an actual leak.
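
One way to check a suspected leak like this without depending on when lsof happens to be run is to record the process's open descriptor count after each file. The following is a rough sketch, assuming a Linux-style /proc filesystem (with a capped probe as a fallback); `count_open_fds`, `process_all` and `process_one` are hypothetical names and not part of Whoosh.

```python
import os


def count_open_fds():
    """Count this process's open file descriptors (Linux /proc assumed)."""
    try:
        # The directory listing itself holds one descriptor while it runs,
        # so subtract it from the total.
        return len(os.listdir("/proc/self/fd")) - 1
    except OSError:
        # Fallback for systems without /proc: probe a capped range of fds.
        count = 0
        for fd in range(4096):
            try:
                os.fstat(fd)
            except OSError:
                continue
            count += 1
        return count


def process_all(paths, process_one):
    """Run process_one on each path and record the fd count after each."""
    history = []
    for path in paths:
        process_one(path)
        history.append((path, count_open_fds()))
    return history
```

If the recorded count keeps climbing from one file to the next instead of returning to a steady value, descriptors are being leaked somewhere in the per-file processing.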

Changed 6 months ago by zzrough

Development has moved to Bitbucket, and Matt fixed the first part of this bug when importing the project there. The first part of the bug being:

"I'm indexing some large files by opening the index and creating a new writer for each file I index and then committing after each file is indexed."

I've explained this in http://bitbucket.org/mchaput/whoosh/issue/9.

The main lock fd was not closed. Anyway, from your lsof output, you are hitting a segment fd leak this time.

I'd like to fix the bug on my side quickly: can you tell me which Whoosh version you are using and when exactly you ran lsof? In the middle of indexing?

Thanks a lot!

Changed 6 months ago by zzrough

  • cc stephane.demurget@… added

Changed 6 months ago by r1chardj0n3s

I'm using 0.3.16 and I ran lsof during indexing.
