Variable selection using presence-only data with applications to biochemistry
MIT Building E18, Room 304, Ford Building (E18), 50 Ames Street, Cambridge, MA, United States
Abstract: In a number of problems, we are presented with positive and unlabelled data, referred to as presence-only responses. The application I present today involves studying the relationship between protein sequence and function. Presence-only data arise here because, for many experiments, it is impossible to obtain a large set of negative (non-functional) sequences. Furthermore, if…