# 4 Moments: Skew, Kurtosis, Mean, Variance in Scala and Apache Spark

Moments are specific quantitative measures of the shape of a data set. In statistics, **moments** are used to understand the characteristics of a probability distribution. We usually use moments to characterise the data and to judge how closely its shape matches a normal distribution. Moments are used to measure the central tendency, dispersion, skewness and kurtosis of a distribution. So let's find out how to calculate these moments in Scala with Spark.

### Central tendency

This is simply the mean, the average value of a distribution. To calculate the central tendency we can use `Imputer` or Spark SQL's stats functions.

### Dispersion

This is simply the variance, a measure of how far the data set is spread out. To calculate the central tendency and dispersion, refer to this tutorial.
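As a minimal sketch of the first two moments, the block below computes the mean and the variance with Spark SQL aggregate functions. The object name `MeanVarianceSketch`, the helper `summarize` and the in-memory sample values are assumptions made for illustration; they stand in for the CSV file used in the rest of this post.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, mean, var_pop, var_samp}

object MeanVarianceSketch {
  // Returns (mean, sample variance, population variance) of the "value" column.
  def summarize(data: DataFrame): (Double, Double, Double) = {
    val row = data.agg(
      mean(col("value")),     // central tendency
      var_samp(col("value")), // dispersion, dividing by n - 1
      var_pop(col("value"))   // dispersion, dividing by n
    ).head()
    (row.getDouble(0), row.getDouble(1), row.getDouble(2))
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("mean-variance-sketch")
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._

    // Hypothetical sample data, not the data set from the tutorial.
    val data = Seq(2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0).toDF("value")

    val (avg, sampleVar, popVar) = summarize(data)
    println(s"mean=$avg sampleVariance=$sampleVar populationVariance=$popVar")

    spark.stop()
  }
}
```

For these eight values the mean is 5, the population variance is 4 and the sample variance is 32/7. Which variance you want depends on whether the rows are the whole population or a sample of it.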

### Skewness

Skewness is a measure of symmetry, or more precisely, the lack of symmetry. A distribution, or data set, is symmetric if it looks the same to the left and right of the centre point. Let's see how to calculate skewness in Spark with Scala.

```
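import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.expressions.Window
import org.apache.spark.sql.functions._

object Main extends App {
  println("hello world")

  // Start a local Spark session.
  val spark = SparkSession
    .builder()
    .appName("test")
    .config("spark.master", "local")
    .getOrCreate()

  // Read the CSV file with a header row; without a schema every column is a string.
  var data = spark.read.format("csv")
    .option("header", true)
    .load("/<data-downloaded-from mean tutorial>.csv")

  // Number the rows by year and drop the first two.
  data = data.withColumn("rn", row_number().over(Window.orderBy("year")))
  data = data.filter(data("rn") > 2)

  // Exclude rows whose value is "C", then compute the skewness of the column.
  // `=!=` replaces the deprecated `!==`; the cast makes the string column numeric.
  data.filter(data("value") =!= "C")
    .agg(skewness(data("value").cast("double")))
    .show()
}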
```
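For reference, skewness can be written in terms of central moments. The population form is shown here; to my understanding it is also the form that Spark's `skewness` aggregate returns, but that is an assumption worth checking against the documentation for your Spark version. The symbols $n$, $x_i$, $\bar{x}$ and $m_k$ are notation introduced for this sketch.

$$
\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i, \qquad
m_k = \frac{1}{n}\sum_{i=1}^{n} \left(x_i - \bar{x}\right)^k, \qquad
\text{skewness} = \frac{m_3}{m_2^{3/2}}
$$

A value of 0 means the data are symmetric around the mean, a positive value means the right tail is longer, and a negative value means the left tail is longer.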

### Kurtosis

Kurtosis is a measure of whether the data are heavy-tailed or light-tailed relative to a normal distribution. That is, data sets with high kurtosis tend to have heavy tails, or outliers. Data sets with low kurtosis tend to have light tails, or a lack of outliers. A uniform distribution would be the extreme case. To calculate kurtosis in Spark with Scala, refer to the code below.

```
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.expressions.Window
import org.apache.spark.sql.functions._

object Main extends App {
  println("hello world")

  // Start a local Spark session.
  val spark = SparkSession
    .builder()
    .appName("test")
    .config("spark.master", "local")
    .getOrCreate()

  // Read the CSV file with a header row; without a schema every column is a string.
  var data = spark.read.format("csv")
    .option("header", true)
    .load("/<data-downloaded-from mean tutorial>.csv")

  // Number the rows by year and drop the first two.
  data = data.withColumn("rn", row_number().over(Window.orderBy("year")))
  data = data.filter(data("rn") > 2)

  // Exclude rows whose value is "C" (`=!=` replaces the deprecated `!==`)
  // and cast the string column to a number before aggregating.
  val cleaned = data.filter(data("value") =!= "C")
  val value = data("value").cast("double")

  cleaned.agg(skewness(value)).show()
  cleaned.agg(kurtosis(value)).show()
}
}
```
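To make the two shape measures less of a black box, here is a minimal sketch in plain Scala, with no Spark involved, that computes them from central moments. It uses the population formulas, and the kurtosis is the excess form (3 is subtracted so that a normal distribution scores 0). To my understanding these match what Spark's `skewness` and `kurtosis` aggregates return, but treat that as an assumption. The object name `MomentsByHand`, its helper names and the sample values are all hypothetical.

```scala
object MomentsByHand {
  // k-th central moment: the average of (x - mean)^k over all values.
  def centralMoment(xs: Seq[Double], k: Int): Double = {
    val mean = xs.sum / xs.size
    xs.map(x => math.pow(x - mean, k)).sum / xs.size
  }

  // Population skewness: m3 / m2^(3/2).
  def skewness(xs: Seq[Double]): Double =
    centralMoment(xs, 3) / math.pow(centralMoment(xs, 2), 1.5)

  // Excess kurtosis: m4 / m2^2 - 3.
  def excessKurtosis(xs: Seq[Double]): Double =
    centralMoment(xs, 4) / math.pow(centralMoment(xs, 2), 2) - 3.0

  def main(args: Array[String]): Unit = {
    // Hypothetical samples: one symmetric, one with a long right tail.
    val symmetric = Seq(1.0, 2.0, 3.0, 4.0, 5.0)
    val rightTailed = Seq(1.0, 1.0, 1.0, 2.0, 10.0)

    println(s"symmetric:   skewness=${skewness(symmetric)} kurtosis=${excessKurtosis(symmetric)}")
    println(s"rightTailed: skewness=${skewness(rightTailed)} kurtosis=${excessKurtosis(rightTailed)}")
  }
}
```

The symmetric sample has a skewness of 0 and a negative excess kurtosis, which is what you would expect from evenly spread values with light tails. The second sample has a positive skewness because of its single large value.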