dc.contributor.author |
Jreij, Georges Antoun |
|
dc.date.accessioned |
2016-03-04T09:48:39Z |
|
dc.date.available |
2016-03-04T09:48:39Z |
|
dc.date.copyright |
6/11/2013 |
en_US |
dc.date.issued |
2016-03-04 |
|
dc.identifier.uri |
http://hdl.handle.net/10725/3266 |
|
dc.description.abstract |
Classification consists of predicting group membership for new data instances by learning from pre-classified data instances. Classification is crucial because it contributes to solving problems in many fields, such as biochemistry, social sciences, and bioinformatics. Classification has three main components: the classification algorithm, the pre-classified data (training data), and the unclassified data (testing data). Classification accuracy is a measure of how well a classification algorithm classifies the unclassified data. Several algorithms tackle this problem, including C4.5, neural networks, and Bayesian networks. However, since algorithms do not perform equally on the same data, a detailed study of the “algorithm-data relationship” is needed to assess the overall performance of these algorithms rather than relying only on their accuracy. To support this point of view, we explore and assess eight classification algorithms on eight disease detection datasets, each with different characteristics. A detailed comparative study highlights the advantages and drawbacks of each algorithm. |
en_US |
dc.language.iso |
en |
en_US |
dc.subject |
Disease -- Classification |
en_US |
dc.subject |
Lebanese American University -- Dissertations |
en_US |
dc.subject |
Dissertations, Academic |
en_US |
dc.title |
Using machine learning for disease detection. (c2013) |
en_US |
dc.type |
Thesis |
en_US |
dc.title.subtitle |
a comparative study |
en_US |
dc.term.submitted |
Spring |
en_US |
dc.author.degree |
MS in Computer Science |
en_US |
dc.author.school |
SAS |
en_US |
dc.author.idnumber |
200402329 |
en_US |
dc.author.commembers |
Takche, Jean |
|
dc.author.commembers |
Khazen, George |
|
dc.author.woa |
OA |
en_US |
dc.author.department |
Computer Science and Mathematics |
en_US |
dc.description.embargo |
N/A |
en_US |
dc.description.physdesc |
1 hard copy: xix, 146 leaves; ill.; 30 cm. available at RNL. |
en_US |
dc.author.advisor |
Azar, Danielle |
|
dc.keywords |
Classification via clustering |
en_US |
dc.keywords |
Comparative study |
en_US |
dc.keywords |
Decision trees |
en_US |
dc.keywords |
Disease detection |
en_US |
dc.keywords |
K nearest neighbor |
en_US |
dc.keywords |
Logistic regression |
en_US |
dc.keywords |
Machine learning |
en_US |
dc.keywords |
Medical datasets |
en_US |
dc.keywords |
Multilayered perceptron |
en_US |
dc.keywords |
Naïve Bayes |
en_US |
dc.keywords |
Neural networks |
en_US |
dc.keywords |
Partial decision trees |
en_US |
dc.keywords |
Voting feature intervals |
en_US |
dc.description.bibliographiccitations |
Includes bibliographical references (leaves 138-146). |
en_US |
dc.identifier.doi |
https://doi.org/10.26756/th.2013.49 |
en_US |
dc.publisher.institution |
Lebanese American University |
en_US |