Welcome to PhiloLogic Friday, May 09 2008 @ 07:49 AM CDT  
PhiloLogic To-Do List

This is a constantly growing list of things that we intend to fix for future revisions of PhiloLogic.


Add a routine to newextract (the bibliography generator) that checks whether each file has the basic elements required for loading as a TEI/ATE or other file. It would check for a DIV level object, a P level object, and possibly for some CDATA contents. More later... This is to handle the fact that a directory of XML files can include files that contain only a few headers, including, and other stuff....
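A minimal sketch of such a pre-load check, written in Python rather than inside newextract itself. The function name check_loadable and the exact rules (element names matching div, div0, div1, ... and p) are assumptions for illustration; the real tests would depend on the encoding being loaded.

```python
import re
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip any namespace and lowercase, e.g. '{ns}DIV1' -> 'div1'."""
    return tag.rsplit("}", 1)[-1].lower()


def check_loadable(xml_text):
    """Return a list of problems; an empty list means the file looks loadable.

    Assumed rules: at least one DIV level element, at least one P element,
    and some non-whitespace text inside a DIV.
    """
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError as err:
        return ["not well-formed XML: %s" % err]

    problems = []
    elements = list(root.iter())
    divs = [el for el in elements if re.fullmatch(r"div\d*", local_name(el.tag))]
    if not divs:
        problems.append("no DIV level object")
    elif not any("".join(d.itertext()).strip() for d in divs):
        problems.append("DIV level objects contain no text")
    if not any(local_name(el.tag) == "p" for el in elements):
        problems.append("no P level object")
    return problems
```

A file that contains only a header would come back with two problems listed, so it could be skipped or reported before the load starts.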
Martin Mueller (Oct 10, 2005) suggests a random hit function:
I think that adding a random sample feature would be very useful.
For any set of returns that runs in the hundreds, not to speak of
thousands, it would be a terrific first orientation to have a random  
sample. It might have a minimum size--say 50, and then increase as a  
fraction of the total size until the sample size is such that  
increasing the sample size won't add much.

As an example, I have a student who is interested in figuring out the  
relationship of cognitive and ethical meanings in 'true' from Chaucer  
to Shakespeare. There are close to 20,000 occurrences of tr[vu]e in  
that period. For her, a random sample of 1,500 would let her figure  
out in a day where the action is.
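
One way the suggestion above could be sketched, in Python. The minimum of 50 comes from the message; the 10% fraction, the 1,500 cap, and the function names are assumptions chosen for illustration, not a PhiloLogic API.

```python
import random


def sample_size(total, minimum=50, fraction=0.1, maximum=1500):
    """Pick a sample size: a fraction of the hits, clamped between minimum
    and maximum, and never more than the number of hits available."""
    size = max(minimum, int(total * fraction))
    return min(size, maximum, total)


def random_hits(hits, seed=None, **kwargs):
    """Return a random sample of hits, kept in their original order."""
    n = sample_size(len(hits), **kwargs)
    rng = random.Random(seed)
    # Sample positions, then sort them so results stay in load order.
    chosen = sorted(rng.sample(range(len(hits)), n))
    return [hits[i] for i in chosen]
```

With these assumed defaults, a result set of 20,000 hits yields a sample of 1,500, and a result set smaller than 50 is returned whole.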


Not a TODO, but a kewl idea from Orion that I don't want to lose track of....:

Somebody from the New York Times is asking people to submit addresses
of things from books, for them to add to a map of Places Mentioned In
Books, a "literary map of manhattan".

http://www.nytimes.com/2005/05/01/books/review/01COHENHO.html?ex=1272600000&en=9093cefdfcdb6409&ei=5090&partner=rssuserland&emc=rss
(tinyurl: http://tinyurl.com/9ew8h)

Some of these have addresses ("The Talented Mr. Ripley") but most of
them don't.  A fun project -- for someone with lots of text and a fast
search engine and Google Maps -- would be to map all of this
automatically, parsing out addresses or intersections or what have
you.  Though of course it would be impossible to get everything that a
human could.

New results format: map geographically.
Note: I gave a very general talk on "Mapping Textuality" a few years ago. I would love to do this. Very interesting.....something to think about.
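A rough sketch of the "parsing out addresses or intersections" step, in Python. Both patterns are deliberately narrow assumptions (numbered East/West street addresses, and "X Street/Avenue and Y Street/Avenue" intersections); as the message says, this will miss much of what a human reader would catch, and geocoding the matches is not shown.

```python
import re

# Assumed pattern: "<number> East|West <N>th Street", e.g. "14 West 10th Street".
ADDRESS = re.compile(
    r"\b\d{1,4}\s+(?:East|West)\s+\d{1,3}(?:st|nd|rd|th)\s+Street\b"
)
# Assumed pattern: "<name> Street|Avenue and <name> Street|Avenue".
INTERSECTION = re.compile(
    r"\b\w+\s+(?:Street|Avenue)\s+and\s+\w+\s+(?:Street|Avenue)\b"
)


def find_places(text):
    """Return (kind, matched text) pairs: addresses first, then intersections."""
    places = [("address", m.group(0)) for m in ADDRESS.finditer(text)]
    places += [("intersection", m.group(0)) for m in INTERSECTION.finditer(text)]
    return places
```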
IWW style requeries, to re-present results in different ways, giving the user a "filter" (Julia's expression) approach to result sets. Russ and I are thinking of a dynamic results header as a drop-in block of code, which would keep LATENTQUERYSTRING on the server and parse it for different result sets.
Carole Mah sez: Put something about the sort order of basic text search results. These are in LOAD order. The general loader tries to sort out the load in chronological order (year only). This could simply be put in philosubs..... Oddly enuff, we won't always know about that. Geez, you wudda thunk we would have something like that, eh?
Create the philohistory directory either on install or when the Philo history function is run and does not find one. Check to see if it reads the PHILOTMP directive.
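A minimal sketch of the "create it if it is not there" behaviour, in Python. The tmp_root argument stands in for whatever the PHILOTMP setting resolves to; placing philohistory under that directory is an assumption, and how the setting is read is not shown.

```python
import os


def ensure_history_dir(tmp_root):
    """Create <tmp_root>/philohistory if it is missing and return its path."""
    path = os.path.join(tmp_root, "philohistory")
    # exist_ok=True makes this safe to call on every run of the history function.
    os.makedirs(path, exist_ok=True)
    return path
```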
Add sort by frequency to Terms button? There may be speed problems with this. And I don't have a good idea about how to put the switch in the interface (a global selection)?
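For illustration, the ordering itself could look like the Python sketch below, assuming the term list is already available as a mapping from term to count (an assumption; how the Terms button gathers its counts is not shown). The speed concern would be in collecting the counts, not in this sort.

```python
def by_frequency(counts):
    """Return (term, count) pairs, most frequent first, ties in alphabetical order."""
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))
```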
02-22-05: Allow the sort -T location to be specified in the general philo configuration, which will probably avoid the next problem. Note that it can currently be set in loader.xmake by changing
SORTFLAGS= -T . -y +0 -1 +1 -2n +2 -3n +3 -4n +4 -5n +5 -6n +6 -7n +7 -8n
to point at some other location with lots of space:
SORTFLAGS= -T /export/home/thymephilo/mark/temp/philosort/ -y +0 -1 +1 -2n +2 -3n +3 -4n +4 -5n +5 -6n +6 -7n +7 -8n
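
A small Python sketch of how a configured temporary directory could be turned into the SORTFLAGS value. The function name sort_flags is hypothetical, and where the directory setting would live in the philo configuration is left open; the key specification is copied from the lines above.

```python
import os

# Key specification copied unchanged from the existing SORTFLAGS line.
SORT_KEYS = "-y +0 -1 +1 -2n +2 -3n +3 -4n +4 -5n +5 -6n +6 -7n +7 -8n"


def sort_flags(tmp_dir="."):
    """Build the SORTFLAGS value for a given sort temporary directory."""
    if not os.path.isdir(tmp_dir):
        raise ValueError("sort temp directory does not exist: %s" % tmp_dir)
    return "-T %s %s" % (tmp_dir, SORT_KEYS)
```

Called with no argument, this reproduces the current default of sorting in the working directory.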

02-22-05: Trap the "No space left on device" error on load. If this happens while texts are being read in, the loader simply stops loading the offending batch and, in certain circumstances, goes on to load the database without noticing that it is missing a batch.
Loading 999 ===> TEXTS/pharisjn.xml... 
/usr/local/bin/sort: write failed: ./sortU6aO_m: No space left on device
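
A sketch of the trap, in Python and under the assumption that each batch step runs as an external command whose exit status can be checked. The names LoadError and run_batch_step are hypothetical; the point is only that a failed sort stops the load instead of letting it continue with a batch missing.

```python
import subprocess


class LoadError(Exception):
    """Raised when a batch step fails, so the load stops instead of continuing."""


def run_batch_step(cmd, batch_name):
    """Run one external step (for example the sort) and fail loudly on error."""
    result = subprocess.run(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,
    )
    if result.returncode != 0:
        # Pass along the tool's own message, e.g. "No space left on device".
        reason = result.stderr.strip() or "exit status %d" % result.returncode
        raise LoadError("batch %s failed: %s" % (batch_name, reason))
    return result.stdout
```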

02-22-05: and while we're at it, let's encourage a default database directory that is NOT in the standard install location (/var/lib/philologic/databases/).
 Copyright © 2008 The University of Chicago
 PhiloLogic™ is a registered trademark of the University of Chicago.
All other trademarks and copyrights on this page are owned by their respective owners.