Computer Security: A Machine Learning Approach

Sandeep V. Sabnani

(2008)

Sandeep V. Sabnani (2008) Computer Security: A Machine Learning Approach.


Abstract

In this thesis, we apply machine learning to computer security, particularly to intrusion detection. We analyse two learning algorithms, NBTree and VFI, on the task of detecting intrusions and compare their performance. We then comment on the suitability of NBTree for intrusion detection, based on its high accuracy and high recall. Finally, we discuss the usefulness of machine learning to the field of computer security and comment on the security of machine learning itself.

Information about this Version

This is a Published version
This version's date is: 07/01/2008
This item is peer reviewed

Link to this Version

https://repository.royalholloway.ac.uk/items/eb400e6b-efbd-8729-78e9-ae1e787835c3/1/

Item Type: Monograph (Technical Report)
Title: Computer Security: A Machine Learning Approach
Authors: Sabnani, Sandeep V.
Departments: Faculty of Science\Mathematics

Deposited on 24-Jun-2010 in Royal Holloway Research Online. Last modified on 15-Dec-2010.

