Glossary
Attribute
An attribute is a key–value representation of a piece of information about a transaction. Attributes are a result of the enrichment process: the more datapoints you send via the Fraugster API, the more attributes become available for risk analysis and rule creation. Fraugster models use attributes to score a transaction, while you use attributes to create rules.
Datapoint
A datapoint is a key–value pair that represents a piece of information about a transaction. The Fraugster API receives each transaction as a set of multiple datapoints that carry information about the transaction itself, the customer, or any other purchase-related information.
Fraugster score
The Fraugster score is a probabilistic score ranging from 1 to 100. It represents the estimated likelihood that a transaction is fraudulent, given the general probability of fraud in the population.
k-NN
A k-nearest neighbors algorithm, often abbreviated k-NN, is a supervised machine learning algorithm that can be used to solve both classification and regression problems. It determines which group a data point belongs to by looking at the data points around it.
The 'k' in k-NN is a parameter that sets the number of nearest neighbors included in the majority voting process.
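As an illustration of the voting step, here is a minimal sketch of k-NN classification in plain Python. The function name knn_classify, the toy data, and the choice of Euclidean distance are assumptions made for this example; it is not a description of how Fraugster models are implemented.

```python
import math
from collections import Counter

def knn_classify(train, query, k=3):
    """Classify `query` by majority vote among its k nearest training points.

    `train` is a list of (features, label) pairs with numeric feature tuples.
    """
    # Sort the training points by Euclidean distance to the query point.
    by_distance = sorted(train, key=lambda item: math.dist(item[0], query))
    # Count the labels of the k closest points and return the most common one.
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]

# Toy data (illustrative only): two numeric features per point.
train = [
    ((1.0, 1.0), "good"),
    ((1.2, 0.8), "good"),
    ((0.9, 1.1), "good"),
    ((5.0, 5.0), "bad"),
    ((5.2, 4.9), "bad"),
]

print(knn_classify(train, (1.1, 1.0), k=3))
print(knn_classify(train, (4.8, 5.1), k=3))
```

A small k makes the vote sensitive to individual neighbors, while a large k smooths the decision over a wider area.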
Logistic regression
Logistic regression is a longstanding classification algorithm. It is used to predict a binary outcome in structured data sets, based on a set of independent variables. In other words, for any given input, the prediction can take only one of two possible values.
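The prediction step can be sketched in a few lines of Python: a weighted sum of the independent variables is passed through the sigmoid function to obtain a probability, which is then cut at a threshold to produce one of two outcomes. The weights, bias, and threshold below are hand-picked assumptions for illustration; in practice they would be learned from data.

```python
import math

def sigmoid(z):
    """Map any real number to a value strictly between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))

def predict_proba(weights, bias, features):
    """Probability of the positive class for one input."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return sigmoid(z)

def predict_label(weights, bias, features, threshold=0.5):
    """Binary outcome: 1 if the probability reaches the threshold, else 0."""
    return 1 if predict_proba(weights, bias, features) >= threshold else 0

# Hypothetical, hand-picked parameters (not learned from real data).
weights = [0.8, -1.5]
bias = -0.2

print(predict_proba(weights, bias, [2.0, 0.5]))
print(predict_label(weights, bias, [2.0, 0.5]))
print(predict_label(weights, bias, [0.0, 1.0]))
```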
Neural network
A neural network is a computational learning system that uses a network of functions to understand and translate a data input of one form into a desired output, usually in another form. The concept was inspired by human biology: it is modeled on the way neurons in the human brain and nervous system work together to interpret inputs from the senses.
Signal
A signal describes a behavioral pattern that Fraugster data models identified for a transaction. Signals are a product of the enrichment process – they derive from the datapoints and attributes that represent each transaction. The Fraugster Engine puts together a set of signals after analyzing each transaction.
Story
A story is a human-readable text that represents the choices made by the Fraugster models. It provides insight into why the Fraugster Engine took each decision and what evidence it used to score a transaction. Stories are written by the AI Engine according to the decision it took for each transaction.
Random forest
Random forest is an ensemble machine learning algorithm used for a variety of tasks including regression and classification. It uses a large number of small decision trees and combines their predictions to produce a more accurate result.
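The idea of combining many small trees can be sketched as follows. This is a deliberately simplified illustration: each "tree" is a depth-1 decision stump trained on a bootstrap sample using one randomly chosen feature, and the forest predicts by majority vote. The function names, toy data, and the use of stumps instead of deeper trees are assumptions for this example only.

```python
import random
from collections import Counter

def fit_stump(rows, feature):
    """Fit a depth-1 tree on one feature: pick the threshold with the fewest errors."""
    best = None
    for threshold in sorted({x[feature] for x, _ in rows}):
        left = [y for x, y in rows if x[feature] <= threshold]
        right = [y for x, y in rows if x[feature] > threshold]
        left_label = Counter(left).most_common(1)[0][0]
        right_label = Counter(right).most_common(1)[0][0] if right else left_label
        errors = sum(y != left_label for y in left) + sum(y != right_label for y in right)
        if best is None or errors < best[0]:
            best = (errors, feature, threshold, left_label, right_label)
    return best[1:]

def fit_forest(rows, n_trees=25, seed=0):
    """Train n_trees stumps, each on a bootstrap sample and a random feature."""
    rng = random.Random(seed)
    n_features = len(rows[0][0])
    forest = []
    for _ in range(n_trees):
        sample = [rng.choice(rows) for _ in rows]  # sample with replacement
        forest.append(fit_stump(sample, rng.randrange(n_features)))
    return forest

def forest_predict(forest, x):
    """Combine the individual tree predictions by majority vote."""
    votes = Counter(left if x[f] <= t else right for f, t, left, right in forest)
    return votes.most_common(1)[0][0]

# Toy data (illustrative only): two numeric features per row.
rows = [
    ((1.0, 2.0), "good"), ((1.5, 1.8), "good"), ((2.0, 2.2), "good"),
    ((6.0, 7.0), "bad"), ((6.5, 6.8), "bad"), ((7.0, 7.5), "bad"),
]
forest = fit_forest(rows)
print(forest_predict(forest, (0.5, 1.0)))
print(forest_predict(forest, (9.0, 9.0)))
```

Any single stump can be wrong when its bootstrap sample happens to be unrepresentative; the vote across many trees is what makes the combined result more reliable.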
Tagged data
Tagged data refers to annotated or classified data carrying the target that you want a machine learning model to predict. Tagged data highlights data features and properties that can be analyzed to help predict the target. Also referred to as labelled data.
We aim to tag every transaction that we come across, whether in the perf environment or in production. We can only tag data if we know how a tag is defined.
The definition of a tag (bad, good) can be standard or unique. Normally, if a transaction received a chargeback with a fraud reason code, this transaction would be considered bad. However, for some clients a chargeback on specific transactions is not considered fraud. So the definition of bad is not universal and may vary.
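The difference between a standard and a client-specific definition of bad can be sketched as follows. All names here (Transaction, its fields, standard_tag, client_tag, and the exempt_categories parameter) are hypothetical and chosen only for illustration; they do not reflect the Fraugster API or an actual client definition. The sketch also assumes that a transaction not considered bad is tagged good.

```python
from dataclasses import dataclass

@dataclass
class Transaction:
    # Hypothetical fields, for illustration only.
    chargeback: bool          # the transaction received a chargeback
    fraud_reason_code: bool   # the chargeback carries a fraud reason code
    category: str             # some client-defined grouping of transactions

def standard_tag(tx):
    """Standard definition: a chargeback with a fraud reason code is bad."""
    return "bad" if tx.chargeback and tx.fraud_reason_code else "good"

def client_tag(tx, exempt_categories):
    """Client-specific definition: chargebacks on exempt categories are not fraud."""
    if tx.category in exempt_categories:
        return "good"
    return standard_tag(tx)

tx = Transaction(chargeback=True, fraud_reason_code=True, category="subscription")
print(standard_tag(tx))
print(client_tag(tx, exempt_categories={"subscription"}))
```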
Before we start tagging data, clients need to share their definition of bad with us.
We add two more layers to the clients' definition:
- Manual review by our analysts and their decision for a transaction.
- Linking infrastructure that is applied to examine a transaction further. This layer is built on the other two layers and doesn’t change the client’s definition of bad.
As a result of this multi-sided tagging, each transaction ends up with at least two tags. Additionally, the client's analysts may also want to review transactions – this would create a third tag.
Untagged data
Untagged data refers to data that has not been tagged with labels to identify its characteristics or properties. Untagged data carries no targets to predict. Also referred to as unlabelled data.