Sonalmogra
7 min read · Jun 5, 2021

HELLO AMIGOS!!

Task-05 Cyber Crime cases where they talk about “confusion matrix”

Task Description:

>>Create a blog/article/video about cyber crime cases where they talk about the confusion matrix or its two types of error.

CYBER CRIME

Cyber Attack: It is a malicious and deliberate attempt by an individual or organization to breach the information system of another individual or organization.

Motives behind Cyber Attacks

  • To seek commercial gain by hacking banks and financial institutions.
  • To attack critical assets of a nation.
  • To penetrate into both corporate and military data servers to obtain plans and intelligence.
  • To hack sites to virally communicate a message for some specific campaign related to politics and society.

Types of Cyber Attacks

  • Malware, short for malicious software, refers to any kind of software that is designed to cause damage to a single computer, server, or computer network. Ransomware, spyware, worms, viruses, and Trojans are all varieties of malware.
  • Denial of Service attacks: A Denial-of-Service (DoS) attack is an attack meant to shut down a machine or network, making it inaccessible to its intended users. DoS attacks accomplish this by flooding the target with traffic, or sending it information that triggers a crash.
  • Phishing: It is the method of trying to gather personal information using deceptive e-mails and websites.
  • Social engineering is an attack that relies on human interaction to trick users into breaking security procedures in order to gain sensitive information that is typically protected.
  • SQL Injection:
    SQL (pronounced “sequel”) stands for Structured Query Language, a programming language used to communicate with databases. Many of the servers that store critical data for websites and services use SQL to manage the data in their databases. A SQL injection attack specifically targets such servers, using malicious code to get the server to divulge information it normally wouldn’t.
  • Man-in-the-middle (MitM) attacks, also known as eavesdropping attacks, occur when attackers insert themselves into a two-party transaction. Once the attackers interrupt the traffic, they can filter and steal data.
  • Cross-Site Scripting (XSS):
    Similar to an SQL injection attack, this attack also involves injecting malicious code into a website, but in this case the website itself is not being attacked. Instead, the malicious code the attacker has injected runs only in the user’s browser when they visit the attacked website, and it goes after the visitor directly, not the website.
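
To make the SQL injection item in the list above a bit more concrete, here is a minimal sketch using Python's built-in sqlite3 module. Everything in it is an assumption made up for illustration: the in-memory `users` table, its two rows, and the `user_input` string. It only shows the general idea, that input pasted into the SQL text can change the query, while a parameterized query treats the same input as plain data.

```python
import sqlite3

# Hypothetical in-memory table, used only for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, secret TEXT)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?)",
    [("alice", "alice-secret"), ("bob", "bob-secret")],
)

# Attacker-controlled input: closes the quote and adds an always-true condition.
user_input = "nobody' OR '1'='1"

# Vulnerable: the input is pasted into the SQL text, so it changes the query itself.
unsafe_query = "SELECT name, secret FROM users WHERE name = '" + user_input + "'"
unsafe_rows = conn.execute(unsafe_query).fetchall()

# Safer: a parameterized query passes the input as a value, never as SQL.
safe_rows = conn.execute(
    "SELECT name, secret FROM users WHERE name = ?", (user_input,)
).fetchall()

print("rows leaked by the concatenated query:", len(unsafe_rows))
print("rows returned by the parameterized query:", len(safe_rows))
```

In this toy setup the concatenated query returns every row in the table, while the parameterized one returns nothing, because no user is literally named `nobody' OR '1'='1`.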

Need for Cyber Security

  • For Individuals: Photos, videos and other personal information shared by an individual on social networking sites can be inappropriately used by others, leading to serious and even life-threatening incidents.
  • For Business Organizations: Companies hold a lot of data and information on their systems. A cyber attack may lead to the loss of competitive information (such as patents or original work) or of employees' and customers' private data, resulting in a complete loss of public trust in the integrity of the organization.
  • For Government: A local, state or central government maintains a huge amount of confidential data related to the country (geographical, military strategic assets etc.) and its citizens. Unauthorized access to this data can pose serious threats to a country.

Laws related to Cyber Security in India

Information Technology Act, 2000

  • The act regulates use of computers, computer systems, computer networks and also data and information in electronic format.
  • Among other things, the act lists the following as offences:
    • Tampering with computer source documents.
    • Hacking with computer system.
    • Act of cyber terrorism, i.e. accessing a protected system with the intention of threatening the unity, integrity, sovereignty or security of the country.
    • Cheating using computer resource, etc.

Strategies under the National Cyber Security Policy, 2013

  • Creating a secure cyber ecosystem.
  • Creating mechanisms to handle security threats and respond to them through national systems and processes.
    The Indian Computer Emergency Response Team (CERT-In) functions as the nodal agency for coordination of all cyber security efforts, emergency responses, and crisis management.
  • Securing e-governance by implementing global best practices, and wider use of Public Key Infrastructure.
  • Protection and resilience of critical information infrastructure, with the National Critical Information Infrastructure Protection Centre (NCIIPC) operating as the nodal agency.
    NCIIPC has been created under the Information Technology Act, 2000 to secure India’s critical information infrastructure. It is based in New Delhi.
  • Promoting cutting-edge research and development of cyber security technology.

WHAT IS CONFUSION MATRIX?

A confusion matrix summarizes the classification performance of a model. In the binary case discussed here, the model classifies each output as either 0 or 1.

  1. Accuracy: The percentage correctly classified as 1 or correctly classified as 0, out of all classifications. (True Positive + True Negative) / Total number of classifications
  2. Misclassification: The percentage incorrectly classified as 1 or incorrectly classified as 0, out of all classifications. (False Positive + False Negative) / Total number of classifications
  3. Sensitivity: The percentage correctly classified as 1 out of all actual 1s. True Positive / (True Positive + False Negative)
  4. Specificity: The percentage correctly classified as 0 out of all actual 0s. True Negative / (True Negative + False Positive)
  5. False Positive Rate: The percentage incorrectly classified as 1 out of all actual 0s. False Positive / (False Positive + True Negative)
  6. False Negative Rate: The percentage incorrectly classified as 0 out of all actual 1s. False Negative / (False Negative + True Positive)
  7. Precision: The percentage correctly classified as 1 out of everything classified as 1. True Positive / (True Positive + False Positive)
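
The seven measures in the list above can be sketched as one small Python function. This is only an illustration: the function name `confusion_metrics` and the counts passed to it are made up, and the sketch assumes every denominator is non-zero (a real implementation would have to handle empty classes).

```python
def confusion_metrics(tp, fp, fn, tn):
    """Return the seven measures as fractions between 0 and 1.

    Assumes every denominator below is non-zero.
    """
    total = tp + fp + fn + tn
    return {
        "accuracy": (tp + tn) / total,
        "misclassification": (fp + fn) / total,
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "false_positive_rate": fp / (fp + tn),
        "false_negative_rate": fn / (fn + tp),
        "precision": tp / (tp + fp),
    }


# Hypothetical counts from a test set of 100 samples.
metrics = confusion_metrics(tp=40, fp=5, fn=10, tn=45)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

With these made-up counts, accuracy and misclassification add up to 1, as do sensitivity and the false negative rate, and specificity and the false positive rate.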

Typically, a confusion matrix is filled in using a test set whose true labels are known. The test data is passed through the classifier and the predictions are noted. A table of predicted labels vs true labels is then filled out.

Filling out the confusion matrix is simple: for each sample, you check the predicted label. If it matches the true label, you count one on the diagonal element of the corresponding class. If it is wrongly classified, you count one on the (predicted class, target class) element. This process is repeated for all elements of the set under consideration.
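
The counting procedure described above might look like this in plain Python. It is a minimal sketch: the function name `fill_confusion_matrix` and the two label lists are made up for illustration, and the matrix is indexed as (predicted class, target class), so rows are predictions and columns are true labels.

```python
def fill_confusion_matrix(true_labels, predicted_labels, num_classes=2):
    # matrix[predicted][target]: rows are predicted classes, columns are target classes.
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for target, predicted in zip(true_labels, predicted_labels):
        # A correct prediction lands on the diagonal, a wrong one off the diagonal.
        matrix[predicted][target] += 1
    return matrix


# Hypothetical test set of 8 samples with known true labels.
true_labels = [1, 0, 1, 1, 0, 0, 1, 0]
predicted_labels = [1, 0, 0, 1, 0, 1, 1, 0]

matrix = fill_confusion_matrix(true_labels, predicted_labels)
for row in matrix:
    print(row)
```

For these labels the result is a two-class matrix with 3 on each diagonal element and 1 on each off-diagonal element. Note that some libraries use the opposite orientation (rows for true labels), so it is worth checking the convention before reading the cells.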

The above is an example of a two-class confusion matrix.

  • The sum of the elements in each column gives the number of elements that actually belong to that class.
  • False negatives are, more often than not, the most important result, and they should be kept as low as possible. A false negative occurs when the target class is true but the predicted class is false. In disease-related classification tasks, this can be fatal.
  • The confusion matrix also gives insight into how class imbalance is affecting the classification, so that suitable measures can be taken.
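
The point about class imbalance can be seen with a deliberately extreme, made-up example: a test set of 990 benign events and 10 attacks, and a "detector" that always predicts the majority class. The numbers and variable names are assumptions chosen only to make the effect visible.

```python
# Hypothetical, heavily imbalanced test set: 990 benign events (0) and 10 attacks (1).
true_labels = [0] * 990 + [1] * 10

# A useless "detector" that always predicts the majority class.
predicted_labels = [0] * 1000

pairs = list(zip(true_labels, predicted_labels))
tp = sum(1 for t, p in pairs if t == 1 and p == 1)
tn = sum(1 for t, p in pairs if t == 0 and p == 0)
fp = sum(1 for t, p in pairs if t == 0 and p == 1)
fn = sum(1 for t, p in pairs if t == 1 and p == 0)

accuracy = (tp + tn) / len(true_labels)
sensitivity = tp / (tp + fn)

print("TP, TN, FP, FN:", tp, tn, fp, fn)
print("accuracy:", accuracy)
print("sensitivity:", sensitivity)
```

Accuracy alone looks excellent here, yet every single attack is a false negative. The four cells of the confusion matrix make that visible where the single accuracy number hides it.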

This is one of the performance measures that tell us how good our model is!

Assume we have a binary classification task (that is, two classes [0, 1]). Let's take 0 as the negative class and 1 as the positive class.

From that we get a 2×2 matrix with four cells: TN, FP, FN and TP.

Let's define the notation before going to the calculations.

  • TN: True negative (the actual class is negative and the prediction is negative, so the prediction is correct)
  • FN: False negative (the actual class is positive but the prediction is negative, so the prediction is wrong)
  • FP: False positive (the actual class is negative but the prediction is positive, so the prediction is wrong)
  • TP: True positive (the actual class is positive and the prediction is positive, so the prediction is correct)
  • P: Total number of positive points in data set D, N: Total number of negative points in D
  • Total data points in D = P + N
  • True Positive Rate TPR = TP/P, which is also known as Sensitivity
  • True Negative Rate TNR = TN/N, which is also known as Specificity
  • FPR = FP/N, FNR = FN/P
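
As a quick worked example of the notation above, take some made-up counts: TP = 40, FN = 10, FP = 5, TN = 45. These numbers are purely illustrative and do not come from any real data set.

```latex
\begin{aligned}
P &= TP + FN = 40 + 10 = 50, & N &= TN + FP = 45 + 5 = 50, \\
\mathrm{TPR} &= \frac{TP}{P} = \frac{40}{50} = 0.8, & \mathrm{FNR} &= \frac{FN}{P} = \frac{10}{50} = 0.2, \\
\mathrm{TNR} &= \frac{TN}{N} = \frac{45}{50} = 0.9, & \mathrm{FPR} &= \frac{FP}{N} = \frac{5}{50} = 0.1.
\end{aligned}
```

Since P = TP + FN and N = TN + FP, it always holds that TPR + FNR = 1 and TNR + FPR = 1, so a high TPR automatically means a low FNR, and a high TNR means a low FPR.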

So we want positive points to be classified as positive and negative points to be classified as negative. That means we need high TPR and TNR values and low FPR and FNR values. Accordingly, we need to pick the best model based on our requirements.

  • One of the main drawbacks of this measure is that we need to compare 4 different metrics (TPR, FPR, TNR, FNR) to pick the best model.
  • If we get high TN and TP values (the diagonal elements), then we can say that the model is good. Accuracy can be calculated by dividing the sum of the diagonal elements by the total number of data points: (TP + TN) / (P + N).
  • Just like accuracy, the confusion matrix does not take probability scores into consideration when picking the best model. It simply counts how the points were classified by each model and picks the best model from those counts.
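
The last point, that probability scores are ignored, can be illustrated with a small sketch. The helper `confusion_counts`, the 0.5 threshold and both score lists are assumptions made up for this example: one hypothetical model is very confident, the other barely clears the threshold, yet both end up with the same counts.

```python
def confusion_counts(true_labels, scores, threshold=0.5):
    """Turn scores into 0/1 predictions and count (TP, FP, FN, TN)."""
    tp = fp = fn = tn = 0
    for target, score in zip(true_labels, scores):
        predicted = 1 if score >= threshold else 0
        if predicted == 1 and target == 1:
            tp += 1
        elif predicted == 1 and target == 0:
            fp += 1
        elif predicted == 0 and target == 1:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


true_labels = [1, 1, 0, 0]

# Two hypothetical models: one confident, one barely on the right side of 0.5.
confident_scores = [0.95, 0.90, 0.05, 0.10]
hesitant_scores = [0.55, 0.51, 0.49, 0.45]

counts_confident = confusion_counts(true_labels, confident_scores)
counts_hesitant = confusion_counts(true_labels, hesitant_scores)

print("confident model (TP, FP, FN, TN):", counts_confident)
print("hesitant model  (TP, FP, FN, TN):", counts_hesitant)
```

Both models get identical counts, so the confusion matrix alone cannot tell them apart, even though the second model would likely flip its predictions with a small change in the data or the threshold.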

THANK YOU!!!
