
anomaly-detection-in-video-sequences's Introduction

Event-Anomaly-Detection

Anomalies are rare and of varying nature, so supervised learning approaches fall short in performance when dealing with such data. Autoencoders, however, are perfect for this situation because they can be trained on normal parts alone and don't require annotated data. Once trained, we can feed the autoencoder a feature representation of a part and compare its output with the input: the larger the difference, the more likely the input contains an anomaly.
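The idea can be sketched with a toy example (plain NumPy, not the project's model): fit a linear one-dimensional bottleneck on "normal" data only, then score any input by its reconstruction error. The data, dimensions, and anomaly point here are all made up for illustration.

```python
import numpy as np

# Fit a one-component linear "autoencoder" (PCA direction) on normal data.
rng = np.random.default_rng(0)
coeffs = rng.normal(size=(500, 2))
normal = coeffs @ np.array([[1.0, 0.0], [2.0, 0.1]])  # samples near a line in 2-D

mean = normal.mean(axis=0)
_, _, vt = np.linalg.svd(normal - mean)
pc = vt[0]  # top principal direction acts as the bottleneck

def reconstruction_error(x):
    # Project onto the bottleneck, reconstruct, and measure the residual.
    centered = x - mean
    recon = np.outer(centered @ pc, pc) + mean
    return np.linalg.norm(x - recon, axis=-1)

normal_scores = reconstruction_error(normal)
anomaly = np.array([[5.0, -5.0]])  # a point far from the normal subspace
# The anomaly reconstructs much worse than any normal sample.
print(reconstruction_error(anomaly)[0] > normal_scores.max())
```

The convolutional autoencoders below follow the same recipe, with the linear projection replaced by learned encoder/decoder networks and the residual computed per pixel.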

  • We're submitting a .ipynb file with our project, which is easy to run once you have the necessary dataset, 'UCSD's Anomaly Detection Dataset', and specifically its UCSDped1 folder.
  • Our file has sections named CAE, Spatio-Temporal Stacked CAE, LSTM-Based Stacked CAE, Regularity Scores Comparison and Training Loss Comparison, the first three of which are further subdivided into 'Training Phase' and 'Testing Phase'. Each of these modules can be run standalone.
  • If you're running the code on Colab (and we advise you to do so), store the code file in the same folder as the data.
  • We're using the mxnet deep learning library because we wanted to experiment with a library other than TensorFlow, Keras, and PyTorch, and mxnet is Apache's alternative to them. To install mxnet (only needed on Colab), use:
!pip install mxnet-cu100

A snippet of anomaly detection


The red dots in the anomaly picture can be witnessed in real time.

1. CAE:

1.1 Training Phase:

  • We resized the original images to (1, 100, 100) using the code snippet shown below, as this was conducive to our experiment.
UCSD_FOLDER=os.path.join(DRIVE_MOUNT, 'My Drive', 'UCSD_Anomaly_Dataset.v1p2')
train_files = sorted(glob.glob(UCSD_FOLDER+ '/UCSDped1/Train/*/*'))
train_images = np.zeros((len(train_files),1,100,100))
for idx, filename in enumerate(train_files):
    im = Image.open(filename)
    im = im.resize((100,100))
    train_images[idx,0,:,:] = np.array(im, dtype=np.float32)/255.0
np.save(UCSD_FOLDER+ '/UCSD_Anomaly_Dataset.v1p2.npy',train_images)
  • Once the images are saved, you can run the model's 'Training Phase' to its end without any changes. Make sure the notebook is running on a GPU, as the command ctx = gpu() will try to access one.
  • Once training is done, we save the model's parameters for easy access during testing, especially if you want to run only the testing part of the code. Any changes to the training parameters can be made in this cell before the training loop starts.
im_train = np.load(UCSD_FOLDER+ '/UCSD_Anomaly_Dataset.v1p2.npy')
batch_size= 32
dataset = gluon.data.ArrayDataset(mx.nd.array(im_train, dtype=np.float32))
dataloader = gluon.data.DataLoader(dataset, batch_size=batch_size, last_batch='rollover',shuffle=True)
ctx = gpu()
num_epochs = 50
model = ConvolutionalAutoencoder()
model.hybridize()
model.collect_params().initialize(mx.init.Xavier('gaussian'), ctx=ctx)
loss_function = gluon.loss.L2Loss()
optimizer = gluon.Trainer(model.collect_params(), 'adam', {'learning_rate': 1e-4, 'wd': 1e-5})

1.2 Testing Phase:

  • Herein, we use the images from the Test024 and Test024_gt folders of UCSDped1; you can replace them with another folder of images and ground truth if you want. We then resize the test images and masks in the same manner as the training images and wrap them in a dataloader object.
test_file = sorted(glob.glob(UCSD_FOLDER +'/UCSDped1/Test/Test024/*'))
test_file_gt = sorted(glob.glob(UCSD_FOLDER +'/UCSDped1/Test/Test024_gt/*'))
a = np.zeros((len(test_file_gt),2,100,100))
for idx,filename in enumerate(test_file):
    im = Image.open(filename)
    im = im.resize((100,100))
    a[idx,0,:,:] = np.array(im, dtype=np.float32)/255.0

for idx,filename in enumerate(test_file_gt):
    im = Image.open(filename)
    im = im.resize((100,100))
    a[idx,1,:,:] = np.array(im, dtype=np.float32)/255.0

dataset = gluon.data.ArrayDataset(mx.nd.array(a, dtype=np.float32))
dataloader = gluon.data.DataLoader(dataset, batch_size=1)
  • The following three helper functions are provided for test-time evaluation of the CAE; they were later modified to accommodate the stacked models:
def plot_regularity_score(model,dataloader):
  """
  Calculates the regularity score per frame:
  Regularity Score = 1 - (e_t - min@t(e_t))/max@t(e_t)
  where e_t = sum over pixelwise l2 loss for each frame
  """
  e_t = []
  for image in dataloader:
    img = image[:,0,:,:].reshape(1,1,image.shape[-2],image.shape[-1])
    img = img.as_in_context(mx.gpu())
    output = model(img)
    output = (output.asnumpy().squeeze()*255).reshape(100*100,1)
    img = (img.asnumpy().squeeze()*255).reshape(100*100,1)
    e_xyt = np.linalg.norm(output-img,axis=1,ord=2)
    e_t.append(np.sum(e_xyt))
  e_t_min = min(e_t)
  e_t_max = max(e_t)
  reg_scores = []
  for i in range(len(e_t)):
    reg_scores.append(1 - ((e_t[i]-e_t_min)/e_t_max))
  return reg_scores

model =  ConvolutionalAutoencoder()
model.load_parameters(UCSD_FOLDER+ "/autoencoder_ucsd.params",ctx=ctx)
reg_scores_cae = plot_regularity_score(model,dataloader)
def plot_anomaly(img, output, diff, H, threshold, counter, UCSD_FOLDER):
  """
  Plots the input image, the model's reconstruction, the difference
  between the two, and the predicted anomalies as red dots on the
  input image.
  """
  fig, (ax0, ax1, ax2, ax3) = plt.subplots(ncols=4, figsize=(10, 5))
  ax0.set_axis_off()
  ax1.set_axis_off()
  ax2.set_axis_off()
  ax0.set_title('input image')
  ax1.set_title('reconstructed image')
  ax2.set_title('diff')
  ax3.set_title('anomalies')
  ax0.imshow(img, cmap=plt.cm.gray, interpolation='nearest')
  ax1.imshow(output, cmap=plt.cm.gray, interpolation='nearest')
  ax2.imshow(diff, cmap=plt.cm.viridis, vmin=0, vmax=255, interpolation='nearest')
  ax3.imshow(img, cmap=plt.cm.gray, interpolation='nearest')
  x, y = np.where(H > threshold)
  ax3.scatter(y, x, color='red', s=0.1)
  plt.axis('off')
  fig.savefig(UCSD_FOLDER + '/images/' + str(counter) + '.png')
def model_evaluation(model, dataloader):
  loss_l2_per_frame = []
  threshold = 4*255
  counter = 0
  test_loss_metric = gluon.loss.SigmoidBCELoss()
  for image in dataloader:
    counter = counter + 1
    img = image[:,0,:,:].reshape(1,1,image.shape[-2],image.shape[-1])
    mask = image[:,1,:,:].as_in_context(mx.gpu())
    img = img.as_in_context(mx.gpu())
    output = model(img)
    output = output.transpose((0,2,3,1))
    img = img.transpose((0,2,3,1))
    output = output.asnumpy()*255
    img = img.asnumpy()*255
    diff = np.abs(output-img)
    tmp = diff[0,:,:,0]
    # Smooth the pixelwise difference with a 4x4 box filter before thresholding
    H = signal.convolve2d(tmp, np.ones((4,4)), mode='same')
    H_new = mx.nd.array(np.where(H>threshold,1,0).reshape((1,100,100)), ctx=gpu())
    loss = test_loss_metric(H_new, mask)
    loss_l2_per_frame.append(loss.asscalar())
    plot_anomaly(img[0,:,:,0], output[0,:,:,0], diff[0,:,:,0], H, threshold, counter, UCSD_FOLDER)

  print("Total loss per frame for anomalies predicted = ", sum(loss_l2_per_frame)/len(dataloader))
  • The commands needed to call the functions are contained within the same cells.
  • We've also provided the code to create a video file from the images stored by the last command of the 'plot_anomaly' function. The same code is made available as the last cell of the 'Testing Phase' for the following models.
## Saving the output plots as a video depicting anomalies ##
import cv2
out_im = sorted(glob.glob(UCSD_FOLDER + '/images/*.png'))

img_array = []
for filename in out_im:
    img = cv2.imread(filename)
    height, width, layers = img.shape
    img_array.append(img)

size = (width, height)  # frame size must match the saved plot images
_name = UCSD_FOLDER + '/vid.mp4'
fourcc = cv2.VideoWriter_fourcc(*'MP4V')
out_vid = cv2.VideoWriter(_name, fourcc, 15, size)

for frame in img_array:
    out_vid.write(frame)

out_vid.release()

2. Spatio-Temporal Stacked CAE:

2.1 Training Phase:

n = 10  # number of frames stacked per training cube
files = sorted(glob.glob(UCSD_FOLDER+'/UCSDped1/Train/*/*'))
train_images = np.zeros((int(len(files)/n), n, 227, 227))
i = 0
idx = 0
for filename in range(0, len(files)):
    im = Image.open(files[filename])
    im = im.resize((227,227))
    train_images[idx,i,:,:] = np.array(im, dtype=np.float32)/255.0
    i = i + 1
    if i >= n:
      idx = idx + 1
      i = 0
np.save(UCSD_FOLDER + '/stacked_cae.npy',train_images)
  • The code above depicts how to create the stacked training-image array, with the original images resized into (n, 227, 227) cubes, for the Stacked CAE and LSTM Stacked CAE models. It needs to be run only once; afterwards you can load the stacked images from UCSD_FOLDER + '/stacked_cae.npy', which this snippet creates.
  • Same as before, you can run the rest of this part without any extra effort. Any changes to the training parameters can be made at:
ctx = gpu()
im_train = np.load(UCSD_FOLDER + '/stacked_cae.npy')
batch_size=32
dataset = gluon.data.ArrayDataset(mx.nd.array(im_train, dtype=np.float32))
dataloader = gluon.data.DataLoader(dataset, batch_size=batch_size, last_batch='rollover',shuffle=True)
num_epochs = 50
model = convSTAE()
model.hybridize()
model.collect_params().initialize(mx.init.Xavier('gaussian'), ctx=ctx)
loss_function = gluon.loss.L2Loss()
optimizer = gluon.Trainer(model.collect_params(), 'adam', {'learning_rate': 1e-4, 'wd': 1e-5})

2.2 Testing Phase:

  • The shapes of the test images and their anomaly masks are also modified as per the requirements of the model. We use separate dataloaders for the masks and the images.
model =  convSTAE()
model.load_parameters(UCSD_FOLDER +'/autoencoder_stacked_ucsd.params',ctx=ctx)
batch_size= 1
n=10 
test_file = sorted(glob.glob(UCSD_FOLDER+ '/UCSDped1/Test/Test024/*'))
test_file_gt = sorted(glob.glob(UCSD_FOLDER+'/UCSDped1/Test/Test024_gt/*'))
a = np.zeros((int(len(test_file)/n), n, 227, 227))
i = 0
idx = 0
for filename in range(0, len(test_file)):
    im = Image.open(test_file[filename])
    im = im.resize((227,227))
    a[idx,i,:,:] = np.array(im, dtype=np.float32)/255.0
    i = i + 1
    if i >= n:
      idx = idx + 1
      i = 0

b = np.zeros((int(len(test_file_gt)/n), n, 227, 227))
i = 0
idx = 0

for filename in range(0, len(test_file_gt)):
    im = Image.open(test_file_gt[filename])
    im = im.resize((227,227))
    b[idx,i,:,:] = np.array(im, dtype=np.float32)/255.0
    i = i + 1
    if i >= n:
      idx = idx + 1
      i = 0
## Test-time dataloaders for true images and their anomaly masks ##
dataset = gluon.data.ArrayDataset(mx.nd.array(a, ctx=ctx, dtype=np.float32))
dataloader = gluon.data.DataLoader(dataset, batch_size=1)
test_dataset = gluon.data.ArrayDataset(mx.nd.array(b, ctx=ctx, dtype=np.float32))
test_dataloader = gluon.data.DataLoader(test_dataset, batch_size=1)
  • The 'loss_compute' function is used to evaluate the model, since we need to extract the stacked frames and process them individually. It takes the model's output, the ground-truth masks, the input images, the UCSD folder path, and a frame counter.
def loss_compute(output,image_gt,image,UCSD_FOLDER,counter):
  loss_l2_per_frame = []
  test_loss_metric = gluon.loss.SigmoidBCELoss(from_sigmoid=False)
  # there are 10 channels, each representing one image; flatten them out
  output = output.asnumpy().squeeze()*255
  image_gt= image_gt.asnumpy().squeeze()
  image= image.asnumpy().squeeze()*255
  threshold = 4*255
  for i in range(0,10):
    counter+=1
    im_out = output[i,:,:]
    im = image[i,:,:]
    diff = np.abs(im_out-im)
    H = signal.convolve2d(diff, np.ones((4,4)), mode='same')
    H_new = mx.nd.array(np.where(H>threshold,1,0).reshape((1,227,227)),ctx=gpu())
    mask =  mx.nd.array(image_gt[i,:,:].reshape((1,227,227)),ctx=gpu())
    loss_l2_per_frame.append(test_loss_metric(H_new,mask).asscalar())
    plot_anomaly(im, im_out,diff, H, threshold, counter,UCSD_FOLDER)
  return loss_l2_per_frame
  • To call 'plot_anomaly' and 'loss_compute', use the 'model_evaluation' cell, which automatically gives you the loss value and the anomaly images.
def model_evaluation(model,dataloader,test_dataloader,UCSD_FOLDER):
  loss = []
  im_list = []
  counter = 0
  for image,image_gt in zip(dataloader,test_dataloader):
    output = model(image)
    l = loss_compute(output,image_gt,image,UCSD_FOLDER,counter)
    counter+=10
    loss.extend(l)
  print("Total loss per frame for anomalies predicted = ",sum(loss)/len(loss))
## Call statement  for evaluation ##
model_evaluation(model,dataloader,test_dataloader,UCSD_FOLDER)

3. LSTM- Based Stacked CAE:

3.1 Training Phase:

  • For the LSTM network we don't need to modify the training images, as we can simply use the stacked images from UCSD_FOLDER + '/stacked_cae.npy' created for the previous model. However, since one may wish to observe the results for this model alone, we've added the image-synthesis code to this part as well.
  • As with the previous models, you can use the following cell to modify the network training parameters and then run the training loop without any further action on your part.
batch_size=8
dataset = gluon.data.ArrayDataset(mx.nd.array(im_train, dtype=np.float32))
dataloader = gluon.data.DataLoader(dataset, batch_size=batch_size, last_batch='rollover',shuffle=True)
model = ConvLSTMAE()
ctx = gpu()
num_epochs = 50
model.hybridize()
model.collect_params().initialize(mx.init.Xavier(), ctx=mx.gpu())
loss_function = gluon.loss.L2Loss()
optimizer = gluon.Trainer(model.collect_params(), 'adam', {'learning_rate': 1e-4, 'wd': 1e-5})

3.2 Testing Phase:

  • The testing part of the LSTM network is the same as for the Stacked CAE, except that the 'states' of the model's temporal encoder are passed to the 'model_evaluation' function as follows:
batch_size=8
dataset = gluon.data.ArrayDataset(mx.nd.array(im_train, dtype=np.float32))
dataloader = gluon.data.DataLoader(dataset, batch_size=batch_size, last_batch='rollover',shuffle=True)
model = ConvLSTMAE()
ctx = gpu()
num_epochs = 50
model.hybridize()
model.collect_params().initialize(mx.init.Xavier(), ctx=mx.gpu())
loss_function = gluon.loss.L2Loss()
optimizer = gluon.Trainer(model.collect_params(), 'adam', {'learning_rate': 1e-4, 'wd': 1e-5})
states = model.temporal_encoder.begin_state(batch_size=batch_size, ctx=ctx)

4. Regularity Scores Comparison:

  • This section contains the code to generate comparison plots and the function to calculate the regularity score on stacked test images. Running it provides a visual comparison of the individual models' scores.
def plot_regularity_score_on_stacked_images(model,dataloader,states=None,lstm=False):
  """
  Calculates the regularity score per frame:
  Regularity Score = 1 - (e_t - min@t(e_t))/max@t(e_t)
  where e_t = sum over pixelwise l2 loss for each frame
  """
  e_t = []
  for image in dataloader:
    img = image.as_in_context(gpu())
    if lstm:
      output ,_ = model(img,states)
    else:
      output = model(img)
    output = output.asnumpy().squeeze()*255
    img = img.asnumpy().squeeze()*255
    for i in range(output.shape[0]):
      a = output[i,:,:].reshape(227*227,1)
      b = img[i,:,:].reshape(227*227,1)
      e_xyt = np.linalg.norm(a-b,axis=1,ord=2)
      e_t.append(sum(e_xyt))
  e_t_min = min(e_t)
  e_t_max = max(e_t)
  reg_scores = []
  for i in range(len(e_t)):
    reg_scores.append(1 - ((e_t[i]-e_t_min)/e_t_max))
  return reg_scores

model_stcae = convSTAE()
model_stcae.load_parameters(UCSD_FOLDER +'/autoencoder_stacked_ucsd.params',ctx=ctx)
model_lstm =  ConvLSTMAE()
model_lstm.load_parameters(UCSD_FOLDER +'/autoencoder_lstm_ucsd.params',ctx=ctx)
reg_scores_stcae = plot_regularity_score_on_stacked_images(model_stcae,dataloader,lstm=False)
states = model_lstm.temporal_encoder.begin_state(batch_size=batch_size, ctx=ctx)
reg_scores_lstm = plot_regularity_score_on_stacked_images(model_lstm,dataloader,states,lstm=True)
## Plots
plt.plot(reg_scores_cae,color ='red')
plt.plot(reg_scores_stcae,color = 'green')
plt.plot(reg_scores_lstm,color='blue')
plt.xlabel("frame number")
plt.ylabel("regularity score")
plt.title( "Regularity Score per frame")
plt.legend(['CAE','STCAE', 'LSTM-STCAE'])
plt.show()

5. Training Loss Comparison:

  • In this section, we load the files that store each model's training loss and plot them on a single pane for comparative analysis.
loss_train = np.load(UCSD_FOLDER+'/loss_train_cae.npy')
loss_train_stacked = np.load(UCSD_FOLDER+'/loss_train_stacked.npy')
loss_train_lstm = np.load(UCSD_FOLDER+'/loss_train_lstm.npy')
plt.plot(loss_train,'r')
plt.plot(loss_train_stacked,'g')
plt.plot(loss_train_lstm,'b')
plt.title('Training Loss vs Epochs')
plt.legend(['CAE','STCAE','LSTM-STCAE'])
plt.xlabel('epoch')
plt.ylabel('reconstruction loss')
plt.show()

anomaly-detection-in-video-sequences's People

Contributors

gauriprdhn, sailalitha96


anomaly-detection-in-video-sequences's Issues

Making dataset for the stacked CAE

Thank you for your code and for sharing it.
There is one point I wanted to share with you.

When preparing cubes of n images as input (for example [32, 10, 100, 100]), the boundaries between folders are not taken into consideration. This means a cube like the one below would not be a proper one:

[frame 198 - folder 2, frame 199 - folder 2, frame 200 - folder 2, frame 1 - folder 3, frame 2 - folder 3, ...]

As you can see in the list above, frames from adjacent folders get mixed. Since the folders correspond to different recordings, the result is not a continuous sequence.
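A hypothetical fix for this issue is to group the frame paths by their clip folder before stacking, so no cube ever spans a folder boundary. The folder and file names below are placeholders standing in for the UCSDped1 layout.

```python
import os
from collections import defaultdict

def stack_per_folder(frame_paths, n=10):
    # Group frame paths by their containing folder (one folder = one clip).
    by_folder = defaultdict(list)
    for path in sorted(frame_paths):
        by_folder[os.path.dirname(path)].append(path)
    cubes = []
    for folder in sorted(by_folder):
        frames = by_folder[folder]
        # Non-overlapping cubes of n frames, truncated at the clip's end,
        # so no cube mixes frames from two different clips.
        for start in range(0, len(frames) - n + 1, n):
            cubes.append(frames[start:start + n])
    return cubes

# Two placeholder clips of 15 frames each -> one 10-frame cube per clip.
paths = [f"Train{clip:03d}/{i:03d}.tif" for clip in (1, 2) for i in range(15)]
cubes = stack_per_folder(paths, n=10)
print(len(cubes))  # 2
```

Each returned cube draws all of its frames from a single folder; leftover frames at the end of a clip are dropped rather than joined to the next clip.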

What will be the value of n in: train_images = np.zeros((int(len(files)/n), n, 227, 227))

Respected Sir, thanks for the code. But what will be the value of "n" here?

  1. Spatio-Temporal Layer Stacked CAE:
    2.1 Training Phase:

files = sorted(glob.glob(UCSD_FOLDER+'/UCSDped1/Train/*/*'))
train_images = np.zeros((int(len(files)/n), n, 227, 227))
i = 0
idx = 0
for filename in range(0, len(files)):
    im = Image.open(files[filename])
    im = im.resize((n,n))
    a[idx,i,:,:] = np.array(im, dtype=np.float32)/255.0
    i = i + 1
    if i >= n:
      idx = idx + 1
      i = 0
np.save(UCSD_FOLDER + '/stacked_cae.npy',train_images)

After executing the above code I got this error:

NameError                                 Traceback (most recent call last)
in ()
      1 files = sorted(glob.glob(UCSD_FOLDER+'/UCSDped1/Train/*/*'))
----> 2 train_images = np.zeros((int(len(files)/n), n, 227, 227))
      3 i = 0
      4 idx = 0
      5 for filename in range(0, len(files)):

NameError: name 'n' is not defined
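The undefined name is the temporal stack depth: the README's Testing Phase sets n = 10, and the same value is implied for training. A minimal sketch of the resulting shapes (the 6800-frame count comes from the discussion below; the allocation itself is omitted to keep the example light):

```python
# 'n' is the temporal stack depth, i.e. how many consecutive frames are
# stacked into one training cube; the README's Testing Phase uses n = 10.
n = 10
num_frames = 6800  # total frames in the UCSDped1 Train folders
cube_shape = (num_frames // n, n, 227, 227)
print(cube_shape)  # (680, 10, 227, 227)
```

Defining n before running the snippet resolves the NameError.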

Augmentation for the dataset

The second thing to notice about the dataset is the amount of training data for the stacked-layer Conv AE. From a total of 6800 frames in the ped1 dataset you extract 680 training cubes, which is quite low for learning the complexities in the video. Apart from the various temporal augmentation methods you could apply, there is a more important point. Suppose we have a dataset of just 20 frames:
1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20.
What you have done is extract just two training cubes:
1-10 and 11-20.
However, you could change your step to 1 instead of 10, as below:

1-10, 2-11, 3-12, ..., 10-19, 11-20.

Now, with this technique, a dataset of 20 frames yields 11 cubes instead of 2, i.e. more than five times as much data.
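The suggested stride-1 windowing can be sketched as follows (a self-contained NumPy example on a toy clip, not the repository's code):

```python
import numpy as np

def sliding_cubes(frames, n=10, stride=1):
    """frames: (T, H, W) array -> (num_windows, n, H, W) array of cubes."""
    T = frames.shape[0]
    # Slide a window of n frames forward by `stride` frames at a time.
    starts = range(0, T - n + 1, stride)
    return np.stack([frames[s:s + n] for s in starts])

frames = np.zeros((20, 227, 227), dtype=np.float32)  # toy 20-frame clip
print(sliding_cubes(frames, n=10).shape)          # (11, 10, 227, 227)
print(sliding_cubes(frames, n=10, stride=10).shape)  # (2, 10, 227, 227)
```

With stride=10 this reproduces the README's non-overlapping cubes; stride=1 gives the denser, overlapping set described above.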
