Just Research

Latest Publications

This page presents a selection of Just Research's latest publications in a variety of research areas. The titles listed below are hyperlinks to abstracts; online papers are available in several popular formats. You may also want to try the Cora Computer Science Research Paper Search Engine, one of the publicly available examples of Just Research technology.

Agent Architectures

* JGram: Rapid Development of Multi-Agent Pipelines for Real-World Tasks. Rahul Sukthankar, Antoine Brusseau, Ray Pelletier, Robert Stockton. Proceedings of ASA-1999. [ps 361K] [ps.gz 125K]

Cora Search Engine

* Automating the Construction of Internet Portals with Machine Learning. Andrew McCallum, Kamal Nigam, Jason Rennie and Kristie Seymore. Submitted for journal publication. [ps.gz 3.5MB]
* Using Reinforcement Learning to Spider the Web Efficiently. Jason Rennie and Andrew McCallum. Proceedings of the Sixteenth International Conference on Machine Learning (ICML-99). 1999. [ps.gz 294K]
* A Machine Learning Approach to Building Domain-Specific Search Engines. Andrew McCallum, Kamal Nigam, Jason Rennie, and Kristie Seymore. The Sixteenth International Joint Conference on Artificial Intelligence (IJCAI-99). 1999. [ps 1.4MB]
* Learning Hidden Markov Model Structure for Information Extraction. Kristie Seymore, Andrew McCallum, and Ronald Rosenfeld. In AAAI-99 Workshop on Machine Learning for Information Extraction. 1999. [ps 133K]
* Text Classification by Bootstrapping with Keywords, EM and Shrinkage. Andrew McCallum and Kamal Nigam. In ACL '99 Workshop for Unsupervised Learning in Natural Language Processing. 1999. [ps 177K]
* Building Domain-Specific Search Engines with Machine Learning Techniques. Andrew McCallum, Kamal Nigam, Jason Rennie and Kristie Seymore. AAAI-99 Spring Symposium. (A related paper will also appear in IJCAI-99.) [ps 2.5MB] [ps.gz 188K]

Information Retrieval and Statistical Language Modeling

* Augmenting Learning Classifiers with Hidden Markov Models. Ray Pelletier. Submitted to ICML 2000. [ps 375K]
* Information Extraction with HMMs and Shrinkage. Dayne Freitag and Andrew McCallum. AAAI-99 Workshop on Machine Learning for Information Extraction. [ps 201K] [ps.gz 78K]
* A Hierarchical Probabilistic Model for Novelty Detection in Text. L. Douglas Baker, Thomas Hofmann, Andrew McCallum, Yiming Yang. Submitted to NIPS'99. [ps 158K] [ps.gz 66K]
* Multi-Label Text Classification with a Mixture Model Trained by EM. Andrew McCallum. Submitted to NIPS'99. [ps.gz 131K]
* Using Maximum Entropy for Text Classification. Kamal Nigam, John Lafferty, Andrew McCallum. IJCAI'99 Workshop on Information Filtering. [ps.gz 135K]
* Learning to Classify Text from Labeled and Unlabeled Documents. Kamal Nigam, Andrew McCallum, Sebastian Thrun and Tom Mitchell. Machine Learning Journal, June 1999. [ps 384K] [ps.gz 88K]
* A Comparison of Event Models for Naive Bayes Text Classification. Andrew McCallum and Kamal Nigam. AAAI-98 Workshop on "Learning for Text Categorization". [ps 410K]
* Improving Text Classification by Shrinkage in a Hierarchy of Classes. Andrew McCallum, Ronald Rosenfeld, Tom Mitchell and Andrew Ng. ICML-98. [ps.gz 393K]
* Employing EM in Pool-Based Active Learning for Text Classification. Andrew McCallum and Kamal Nigam. ICML-98. [ps.gz 423K]
* Distributional Clustering of Words for Text Classification. L. Douglas Baker, Andrew McCallum. SIGIR-98, pp. 96-103. [ps 261K] [ps.gz 86K]
* Learning to Extract Knowledge from the World Wide Web. Mark Craven, Dan DiPasquo, Dayne Freitag, Andrew McCallum, Tom Mitchell, Kamal Nigam, Sean Slattery. AAAI-98. [ps.gz 181K]

Natural Language Processing

* Selecting Text Spans for Document Summaries: Heuristics and Metrics. Vibhu Mittal, Mark Kantrowitz, Jade Goldstein and Jaime Carbonell. Proceedings of AAAI-99. [pdf 400K] [ps 480K] [ps.gz 230K]
* Building High-Performance Name Spotters using Machine Learning. Shumeet Baluja, Vibhu Mittal and Rahul Sukthankar. Proceedings of PacLing-99. [pdf 2.2MB] [ps 3.3MB] [ps.gz 200K]
* Generating Extraction Based Summaries from Handwritten Summaries by Aligning Text Spans. Michele Banko, Vibhu Mittal, Mark Kantrowitz and Jade Goldstein. Proceedings of PacLing-99. [pdf 200K] [ps 95K] [ps.gz 35K]
* Ultra-Summarization: A Statistical Approach to Generating Highly Condensed Non-Extractive Summaries. Michael Witbrock and Vibhu Mittal. Proceedings of SIGIR-99. [pdf 330K] [ps 99K] [ps.gz 39K]

Computer Vision

* Memory-based Face Recognition for Visitor Identification. Terence Sim, Rahul Sukthankar, Matthew Mullin, Shumeet Baluja. (FG2000) [ps 751K] [ps.gz 206K]
* ARGUS: An Automated Multi-agent Visitor Identification System. Rahul Sukthankar and Robert Stockton. Proceedings of AAAI-99. [ps 521K] [ps.gz 183K]
* High-Performance Memory-based Face Recognition for Visitor Identification. Terence Sim, Rahul Sukthankar, Matthew Mullin, Shumeet Baluja. (tech report) [ps 783K] [ps.gz 240K]

Optimization

* Estimating the Number of Local Minima in Big, Nasty Search Spaces. Rich Caruana, Matthew Mullin. IJCAI-99 Workshop on Statistical Machine Learning for Large-Scale Optimization. [ps.gz 49K] [ps 242K] [pdf 49K]

Cross Validation

* Complete Cross-Validation for Nearest Neighbor Classifiers. Matthew Mullin, Rahul Sukthankar. (submitted) [ps.gz 47K] [ps 118K] [pdf 108K]
* An Efficient Technique for Calculating Exact Nearest-Neighbor Classification Accuracy. Matthew Mullin, Rahul Sukthankar. (tech report) [ps.gz 38K] [ps 92K]

This page courtesy: Rahul Sukthankar (rahuls@justresearch.com), last updated December 13, 1999