Blog

  • search_engine_on_the_Quran

    Quranic search engine

Main goal

Building a search engine on the Quran that allows us to:

• find specific verses even if the search word’s form (e.g. inflection or attached pronouns) does not match the verse exactly.
• find all verses about a specific concept.
• collect scattered Quranic stories together.
• allow searching in both Arabic and English.

    Cleaning The Data

• usual issues with text data (removing numbers, punctuation, diacritics, …)
  • The re package was used to deal with this problem.
• Basmalah issue:
  • In the Arabic version there was an extra Basmalah attached to the first verse of each chapter,
    while the English version has no extra Basmalahs.

    • So I removed the extra Basmalahs and added a Basmalah before the first verse of each chapter, with verse number = 0.
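
As a rough illustration of the cleaning step, here is a minimal sketch in Python; it assumes the goal is stripping Arabic diacritics (tashkeel), digits, and punctuation, and the exact patterns used in the project may differ.

import re

# Arabic diacritics (U+064B-U+0652), superscript alef, and tatweel
DIACRITICS = re.compile(r"[\u064B-\u0652\u0670\u0640]")
# anything that is neither a word character nor whitespace, plus digits
PUNCT_AND_DIGITS = re.compile(r"[^\w\s]|\d", re.UNICODE)

def clean_verse(text):
    text = DIACRITICS.sub("", text)
    text = PUNCT_AND_DIGITS.sub(" ", text)
    return re.sub(r"\s+", " ", text).strip()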

EDA

• Mainly, I compared Makki chapters with Madani chapters.

    Search engine

To achieve the goals mentioned above, we need to:

• stem each verse and stem the search word before searching (a sketch follows this list)
  • The Arabic version was stemmed with the Farasa stemmer.
  • The English version was stemmed with the NLTK stemmer, namely PorterStemmer.
• tag all verses discussing a specific concept, then search by tag to get all of those verses.
  • almost manually, firstly by names. For example, if a verse contains “messiah”, “son of mary”, or “jesus”, it will be tagged “jesus”.
  • another approach is to look up (either by grabbing a Quran or otherwise) which verses discuss the stories of Moses (for example), note them, and then tag them.
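
To make the stem-then-search and tagging ideas concrete, here is a minimal sketch of the English side using NLTK’s PorterStemmer; the verse list, tag table, and helper functions are illustrative assumptions, not the project’s actual code.

from nltk.stem import PorterStemmer

stemmer = PorterStemmer()

def stem_text(text):
    # stem every token so different word forms match the same query
    return " ".join(stemmer.stem(token) for token in text.lower().split())

def search(word, verses):
    # a verse matches if the stemmed word appears among its stemmed tokens
    target = stemmer.stem(word.lower())
    return [verse for verse in verses if target in stem_text(verse).split()]

# illustrative tag table: concept -> trigger phrases
TAGS = {"jesus": ["messiah", "son of mary", "jesus"]}

def tag_verse(verse):
    text = verse.lower()
    return [tag for tag, keywords in TAGS.items() if any(k in text for k in keywords)]

# "believers" and "believing" reduce to the same stem, so either query matches
print(search("believers", ["Successful indeed are the believing ones"]))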

    The Data

There are 2 data files: one is the Arabic Quran,
and the other is an English translation of the Quran.

Each data file has three important features:
1. The number of the chapter (Surah).
2. The number of the verse (Ayah) within the chapter.
3. The verse, either in Arabic or in English.
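
For illustration, the two files could be loaded as below, assuming the pipe-delimited sura|aya|text layout that Tanzil plain-text downloads use; the file names and the exact layout here are assumptions.

import pandas as pd

def load_quran(path):
    # columns: chapter (Surah) number, verse (Ayah) number, verse text
    return pd.read_csv(path, sep="|", names=["surah", "ayah", "text"],
                       comment="#", engine="python")

arabic = load_quran("quran-simple.txt")      # assumed file name
english = load_quran("english_saheeh.txt")   # assumed file name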

Data sources

    1. The Arabic Data:
      Tanzil Quran Text (Simple, Version 1.1)
      Copyright (C) 2007-2022 Tanzil Project
      License: Creative Commons Attribution 3.0
      This copy of the Quran text is carefully produced, highly
      verified and continuously monitored by a group of specialists
      at Tanzil Project.
      Please check updates at: http://tanzil.net/updates/

2. English translation version:
      Saheeh International translation from
      https://quranenc.com/en/browse/english_saheeh

    Visit original content creator repository
    https://github.com/Saleh-Alabbas/search_engine_on_the_Quran

  • countparticles

    countparticles


    Report the number of particles in each class from a run_data.star file produced by RELION.

    A single-particle cryo-EM reconstruction comes from a set of particle images corresponding to projections of identical particles in different orientations. All datasets are heterogeneous, to various degrees, and data analysis involves classification of particle images. Knowing how many particles contributed to any given class is important to decide how to follow up after a classification job. This command-line tool reports a count of particles in each class in a run_it???_data.star file from a RELION Class2D or Class3D job. It can also optionally produce a bar plot of these particle counts.
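
For context, the core of such a count can be sketched in a few lines with the starfile library. This is a rough sketch of the idea, not the tool’s actual implementation; it assumes a RELION-3.1 file whose particle table sits in a particles block.

import starfile

def count_particles(star_path):
    # RELION 3.1 data star files carry an "optics" and a "particles" block
    blocks = starfile.read(star_path, always_dict=True)
    return blocks["particles"]["rlnClassNumber"].value_counts().sort_index()

print(count_particles("run_it025_data.star"))  # assumed file name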

    This tool was tested with star files produced by RELION-3.1.0. Earlier versions of RELION are not supported.

    Acknowledgments

    I would not have been able to put this tool together without the starfile library.

    Installation

I recommend installing this tool in a dedicated conda environment. You can create one like so (replace ENV_NAME with the name you want to give this environment):

    $ conda deactivate
    $ conda create --name ENV_NAME python=3.9
    $ conda activate ENV_NAME
    

    Once the conda environment is active, you can install the tool with the following command:

    $ pip install countparticles
    

    Usage

    $ countparticles --help
    Usage: countparticles [OPTIONS] <run_data.star>
    
      Report the number of particles in each class from a run_data.star file
      produced by RELION.
    
    Options:
      -p, --plot         Optional. Display a bar plot of the particle counts. This
                         is most helpful with only a few classes, e.g. for typical
                         Class3D results (but not for typical Class2D results with
                         many classes).
    
      -o, --output TEXT  Optional. File name to save the barplot (recommended file
                         formats: .png, .pdf, .svg or any format supported by
                         matplotlib). This option has no effect without the
                         -p/--plot option.
    
      -h, --help         Show this message and exit.
    
    Visit original content creator repository https://github.com/Guillawme/countparticles
  • exposq


This is a tiny app I made that you run locally on your workstation; it dispatches osquery queries to the machines under your command. The available queries are listed at the app’s root route, so you don’t need to dig deep. I really like the osquery project; if you have never checked it out, you should probably take a look.

    Requirement

The only requirement is that your target machines have osquery installed on them; that’s it.

    Install

Standard Procedure

    go get github.com/emirozer/exposq
    

Let’s assume you are going to run exposq from your home directory (/home/user/). After running the command above, you need to create a file called targets.json in /home/user/.

Example formatting of the targets.json file:

Important note: exposq expects a private key, and you can give a key file specific to a target, as in the following JSON structure:

{
    "targets": [
        {
            "user": "user",
            "ip": "ip",
            "key": "key file"
        },
        {
            "user": "user",
            "ip": "ip"
        }
    ],
    "key": "global key file"
}
    

    Usage

    After that just run:

    $>exposq
    

Open up your browser and go to

    localhost:3000
    

And the main route will show you which queries you can dispatch.

    Examples:


Check if any of your machines are being used as a relay.

Check if any of your machines are a victim of MITM.

Check the uptime of your machines.

    Visit original content creator repository https://github.com/emirozer/exposq
  • msa-ecommerce

Implementing event-driven order-payment safely

The order and payment servers are separated, and the order-payment process is implemented with events.

• Failure isolation – as long as the order service is running, a temporary failure in the payment service still results in a successful order and payment once it recovers.
• Extensibility – the order service only needs to publish an order event; consumers are free to consume it and can be scaled out, which makes it easy to extend functionality.

The order-payment feature was implemented event-driven with these benefits in mind.

Order domain event publishing: implementation details

The event publishing design also separates technical and business concerns. Order-payment events are published using the Transaction Outbox pattern and RabbitMQ, but the application layer depends only on a domain event interface, so dropping the Transaction Outbox pattern or switching to a message broker other than RabbitMQ would not propagate changes.

Domain layer class diagram

The domain layer implements the event concern as follows.

• DomainEvent is defined as an interface.
• Concrete domain events (e.g. OrderCreatedEvent) implement DomainEvent.
• When an event occurs, the domain object creates a DomainEvent instance and records it in the DomainEventArchive.
  • DomainEventArchive manages DomainEvents in a List.
  • Composition was chosen over inheritance to keep domain objects extensible, and Lombok annotations minimize code duplication, as shown below.
public class Order {
  private UUID id;
  // other domain attribute fields
  ...
  
  // Composition keeps the domain object open for extension
  // Lombok's @Delegate minimizes duplicated code
  @Delegate
  @Builder.Default
  private DomainEventArchive archive = new DomainEventArchive();    
}

Infrastructure layer class diagram

The infrastructure layer implements the Transaction Outbox pattern as follows.

• OutboxEventPersister converts DomainEvents into OutboxEvents and stores them in the DB.
• OutboxEventPublisher processes events through the following steps:
  1. Fetch the list of ACKed events from Redis and mark those events as delivered in the DB.
  2. Query the DB for events that have not been published (i.e. not ACKed) and publish them.
  3. Among the published events, cache in Redis those for which an ACK response arrives via the RabbitTemplate publish-confirm setting.
  4. Repeating this cycle guarantees at-least-once delivery of every event.
• An OutboxEventScheduler is registered so that this work runs on a short interval.
  • ShedLock is applied to OutboxEventScheduler so that only one server performs the Transaction Outbox scheduling even when scaled out.

By handling event publishing and ACKs with this write-back strategy, event publishing uses minimal system resources while still guaranteeing at-least-once delivery.

Application service implementation

With the structure above, the application layer can express the business flow clearly without knowing the infrastructure layer’s technical implementation.

    @Service
    @RequiredArgsConstructor
    public class OrderService {
    // depends only on the domain layer's interfaces
        private final OrderRepositoryPort orderRepository;
        private final EventPublisher eventPublisher;
    
        @Transactional
        public Mono<UUID> createOrder(CreateOrderCommand command) {
            Order order = createOrderFromCommand(command);
            
            return orderRepository.save(order)
                    .then(eventPublisher.publishAll(order.getEvents()))
                    .then(Mono.fromRunnable(order::clearEvents))
                    .then(Mono.fromCallable(order::getId));
        }
        
        ...
    }

Looking at the OrderService code example, its concerns are fully separated from the infrastructure implementations: it simply creates the Order domain object and passes messages between objects to publish events, so it can focus on business logic.

Event consumption in the payment service

Idempotent event consumption

The payment service consumes events idempotently.

EventConsumer implementation

    @Bean
    public Function<Flux<Message<String>>, Flux<Void>> orderCreatedConsumer() {
        return flux -> flux
                .flatMap(this::consumeEvent)
                .onErrorContinue((error, obj) -> {
                    log.error("Error processing order payment: {}", error.getMessage(), error);
                });
    }
    
    @Transactional
    private Mono<Void> consumeEvent(Message<String> message) {
        OrderCreatedEvent event = MapperUtils.fromJson(message.getPayload(), OrderCreatedEvent.class);
    
        return processedEventRepository.findById(event.eventId())
                .hasElement()
                .flatMap(exists -> {
                    if (exists) {
                        return Mono.empty();
                    }
                    return processedEventRepository.save(new ProcessedEvent(event.eventId()))
                            .then(paymentService.processPayment(createPayment(event)))
                            .then();
                });
    }

PaymentService implementation

    @Transactional
    public Mono<Void> processPayment(Payment payment) {
        return paymentMethodRepository.findById(payment.getPaymentMethodId())
                .flatMap(paymentMethod -> paymentClient.processPayment(payment, paymentMethod))
                .then(paymentRepository.save(payment))
                .then(eventPublisher.publishAll(payment.getEvents()))
                .then(Mono.fromRunnable(payment::clearEvents))
                .then();
    }
• The eventId value is used to check whether the event has been processed before.
• The event is first marked as processed, and the transactional work is performed afterwards.
• Payment processing is delegated through the Service interface provided by the domain, so the application service can focus on the business logic.

Handling event failures with a DLQ

If event processing stalls beyond a certain time because the payment service is down, or keeps failing due to external factors (e.g. the PG server), the payment must be failed at a reasonable cutoff.

This reduces the load on the payment service and the message broker, and it also lets users react to a failed order.

To handle this, the following settings were applied:

• Up to 3 retries are allowed, to cope with transient errors.
• The message TTL is set to 10 minutes, in case the payment service itself is down.
  • Since the order-created outbox event has a TTL of 5 minutes, this leaves generous headroom for retries and delays.
• Messages that move to the DLQ are consumed again by the payment service, which creates and publishes a payment-failed event.

DDD and Hexagonal Architecture design

Module separation and dependency direction

Applying DDD and Hexagonal Architecture concepts, the modules were separated so that dependencies point toward the domain:

• Domain module
  • Focuses on domain logic, designed so that features are implemented through messages between objects
  • Technology-dependent code is avoided by default
  • Applied flexibly, however, weighing the trade-off between technology replaceability and implementation cost
• Infrastructure module
  • Actually implements specific technical concerns
  • Responsibilities are split within the module as well, so that it only provides the technical implementation of a given concern while the business flow runs through Services
• Application module
  • Implements application-level business logic using the domain module’s objects and interfaces
  • Implements the interfaces responsible for communication with the outside
    Visit original content creator repository https://github.com/seminchoi/msa-ecommerce
  • parsepub


    parsepub


    Overview

parsepub is a universal tool written in Kotlin designed to convert an EPUB publication into a data model used later by a reader. In addition, it provides validation and a system that reports inconsistencies in the format.


    Features

    • converting the publication to a model containing all resources and necessary information
    • providing EPUB format support in versions 2.0 and 3.0 for all major tags
    • handling inconsistency errors or lack of necessary elements in the publication structure
    • support for displaying information when element structure attributes are missing

    Restrictions

In order for the program to work properly, the EPUB file must be created in accordance with the format requirements.
    Spec for EPUB 3.0
    Spec for EPUB 2.1


    Base model – description

    The EpubBook class contains all information from an uncompressed EPUB publication. Each of the parameters corresponds to a set of information parsed from the elements of the publication structure.

    data class EpubBook (
        val epubOpfFilePath: String? = null,
        val epubTocFilePath: String? = null,
        val epubCoverImage: EpubResourceModel? = null,
        val epubMetadataModel: EpubMetadataModel? = null,
        val epubManifestModel: EpubManifestModel? = null,
        val epubSpineModel: EpubSpineModel? = null,
        val epubTableOfContentsModel: EpubTableOfContentsModel? = null
    )

    epubOpfFilePath – Contains absolute path to the .opf file.
    epubTocFilePath – Contains absolute path to the .toc file.
    epubCoverImage – Contains all information about the publication cover image.
epubMetadataModel – Contains all basic information about the publication.
epubManifestModel – Contains all publication resources.
    epubSpineModel – Contains list of references in reading order.
    epubTableOfContentsModel – Contains table of contents of the publication.

    More info about the elements of the publication in the
    “Information about epub format for non-developers” section

    Quick start

    To convert the selected EPUB publication, create an instance of the EpubParser class

    val epubParser = EpubParser()

then call the parse method on it:

    epubParser.parse(inputPath, decompressPath)

This method returns an EpubBook class object and takes two parameters:
inputPath – the path to the EPUB file,
decompressPath – the path to the place where the file should be unpacked.

    Error handling in the structure of the publication

The structure of the converted file may be incorrect for one main reason: missing required elements of the publication, such as Metadata, Manifest, Spine, or Table of Contents.

Solution – ValidationListeners
To limit the unexpected effects of an incorrect structure, we can implement the prepared listeners, which alert us when the format is wrong.
On the previously created instance of the EpubParser() class, we call the setValidationListeners method and implement our listeners in its body.
Each listener is assigned to a specific element.

    epubParser.setValidationListeners {
       setOnMetadataMissing { Log.e(ERROR_TAG, "Metadata missing") }
       setOnManifestMissing { Log.e(ERROR_TAG, "Manifest missing") }
       setOnSpineMissing { Log.e(ERROR_TAG, "Spine missing") }
       setOnTableOfContentsMissing { Log.e(ERROR_TAG, "Table of contents missing") }
    } 

    Displaying information about missing attributes

Our parsing method can also return unexpected results when the set of attributes in a file structure element is incomplete,
e.g. a missing language attribute in the Metadata element.

Solution – onAttributeMissing
The mechanism we created answers the problem illustrated above; it is part of the ValidationListener.
When a required attribute is incorrect or missing, our listener reports its name along with its parent’s name.
As parameters, we receive two values:
parentElement – the name of the main element in which the error occurs
attributeName – the name of the missing attribute

    setOnAttributeMissing { parentElement, attributeName ->
        Log.e("$parentElement warn", "missing $attributeName attribute")
    }

    Information about epub format for non-developers

    EPUB is an e-book file format that uses the “.epub” file extension. Its structure is based on the main elements, such as: Metadata, Manifest, Spine, Table of Contents.

Metadata – contains all metadata information for a specific EPUB file. Three metadata attributes are required (though many more are available):
title – contains the title of the book,
language – contains the language of the book,
identifier – contains the unique identifier of the book.

<metadata xmlns:dc="http://purl.org/dc/elements/1.1/">
   <dc:title id="title">Title of the book</dc:title>
   <dc:language>en</dc:language>
   <dc:identifier id="pub-id">id-identifier</dc:identifier>
</metadata>

Manifest – element lists all the files. Each file is represented by an element that has the required attributes:
    id – id of the resource
    href – location of the resource
    media-type – type and format of the resource

    Spine – element lists all the XHTML content documents in their linear reading order.

    Table of contents – contains the hierarchical table of contents for the EPUB file.
    A description of the full TOC specification can be found here:
    TOC spec for EPUB 2.0
    TOC spec for EPUB 3.0

    Visit original content creator repository https://github.com/miquido/parsepub
  • MX_Alps_Hybrid

    Deprecation Warning

    This library has been succeeded by the MX_V2 Library.
    This library will remain available since it is very different structurally from the new one, but will not be updated or maintained.
    Please migrate to the new one whenever possible.

    MX_Alps_Hybrid

    KiCad Libraries of keyboard switch footprints


    Included Libraries

    • MX_Alps_Hybrid.pretty – The original MX/Alps hybrid support footprints.
      • FLIPPED – Reversed LED pads for overlapping switch footprints.
      • NoLED – No LED pads.
      • ReversedStabilizers – Stabilizer mirrored vertically (i.e. for bottom row).
    • MX_Only.pretty – Only for Cherry MX and derivative clones.
      • FLIPPED – Reversed LED pads for overlapping switch footprints.
      • NoLED – No LED pads.
      • ReversedStabilizers – Stabilizer mirrored vertically (i.e. for bottom row).
      • Hotswap – Kailh hotswap sockets of both LED and non-LED variants.
• ALPS_Only.pretty – Only for Alps SKCM/SKCL, SKBM/SKBL, and clones with the same pin structure.
      • LED – Specifically for Alps SKCL with in-switch indicators.
    • Kailh_Choc.pretty – Only for Kailh Choc switches.

    Features

    • Designed from scratch using official datasheets and accurate measurements
    • Various footprints for all occasions
    • Almost every switch size in existence
    • Topside soldermask to prevent solder overflow and improve appearance

    Upgrading

    The library was overhauled on June 1st, 2019 due to its aging structure and contents.

    • The schematic components were updated to work on the 50mil grid. You can replace the components; however, it will take a decent amount of work.
      • If you wish to do this, remove the old schematic library, re-add the new one, and replace the schematic components.
• The footprint library was divided into four distinct libraries. Remove the previous library, re-add the libraries with the footprints you are using, then rebind the footprints in the schematic.

    Request More Footprints

    I’ll be more than happy to make more custom footprints to fit your needs, time permitting. I will admit that I’m definitely short on time nowadays, so I may not be able to respond right away.

    Contributing

    Feel free to create pull requests with more footprints. I only ask that they are of high quality, and that they are based on official dimensions, if possible.


    Visit original content creator repository https://github.com/ai03-2725/MX_Alps_Hybrid
  • Planet_Hop

    Planet_Hop

    Second video game coded in programming language ‘Lua’ and tested in LOVE2D. Pixel art done in GIMP.

HOW TO DOWNLOAD: Download the file by clicking on the GREEN “Code” button above. It will download all files in a zip folder. Unzip the folder, and inside you will see seven (7) items. You can delete the four (4) PNG image files; they are just screenshots from the game. You can also delete the folder “Planet Hop – Code” unless you would like to look at the code for the game. The folder “Planet Hop – Final Game” is the folder with the playable game! Leave this folder intact; you can move it to your desktop or any other location, as long as all eight (8) items stay within that folder. There should be seven (7) .dll files along with an .exe file titled Planet_Hop. Double click Planet_Hop.exe and enjoy!

    INSTRUCTIONS/ HOW TO PLAY: Press ‘enter’ to start the game. After the timer counts down from 3, press the spacebar to make the spaceship “hop”. Time the pressing of the spacebar right so that you stay floating and don’t hit the alien apartments! See how far you can go!

This is what you see when you first start the game.

This is when the game first starts.

This is what the main gameplay looks like.

This is what the Score Screen, or end of the game, looks like.

    Version 1

    ENJOY!

Aman Hafeez, Mechanical Engineer, amanhaf@gmail.com, amanhafeez.com

    Visit original content creator repository https://github.com/amanhaf01/Planet_Hop
  • microapps-app-release


    Overview

    Example / basic Next.js-based Release app for the MicroApps framework.


    Screenshot

    Main View Screenshot of App

    Try the App

    Launch the App

    Video Preview of the App

    Video Preview of App

    Functionality

    • Lists all deployed applications
    • Shows all versions and rules per application
    • Allows setting the default rule (pointer to version) for each application

    Installation

    Example CDK Stack that deploys @pwrdrvr/microapps-app-release:

    The application is intended to be deployed upon the MicroApps framework and it operates on a DynamoDB Table created by the MicroApps framework. Thus, it is required that there be a deployment of MicroApps that can receive this application. Deploying the MicroApps framework and general application deployment instructions are covered by the MicroApps documentation.

    The application is packaged for deployment via AWS CDK and consists of a single Lambda function that reads/writes the MicroApps DynamoDB Table.

    The CDK Construct is available for TypeScript, DotNet, Java, and Python with docs and install instructions available on @pwrdrvr/microapps-app-release-cdk – Construct Hub.

    Installation of CDK Construct

    Node.js TypeScript/JavaScript

    npm i --save-dev @pwrdrvr/microapps-app-release-cdk

    Add the Construct to your CDK Stack

    See cdk-stack for a complete example used to deploy this app for PR builds.

    import { MicroAppsAppRelease } from '@pwrdrvr/microapps-app-release-cdk';
    
    const app = new MicroAppsAppRelease(this, 'app', {
      functionName: `microapps-app-${appName}${shared.envSuffix}${shared.prSuffix}`,
      table: dynamodb.Table.fromTableName(this, 'apps-table', shared.tableName),
      nodeEnv: shared.env as Env,
      removalPolicy: shared.isPR ? RemovalPolicy.DESTROY : RemovalPolicy.RETAIN,
    });
    Visit original content creator repository https://github.com/pwrdrvr/microapps-app-release
  • transfer-learning-anime

    Transfer Learning for Anime Characters

Warning: This repository is quite big (approx. 100 MB) since it includes training and test images.

    Introduction

    This repository is the continuation of Flag #15 – Image Recognition for Anime Characters.

    In Flag #15, we can see that Transfer Learning works really well with 3 different anime characters: Nishikino Maki, Kotori Minami, and Ayase Eli.

    In this experiment, we will try to push Transfer Learning further, by using 3 different anime characters which have hair color similarity: Nishikino Maki, Takimoto Hifumi, and Sakurauchi Riko.

This experiment has 3 main steps (a rough sketch of steps 1 and 2 follows the list):

1. Utilize lbpcascade_animeface to recognize character faces in each image
2. Resize each image to 96 x 96 pixels
3. Split images into training & test sets before creating the final model
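
Steps 1 and 2 can be sketched in Python with OpenCV as follows; this is a rough sketch that assumes lbpcascade_animeface.xml sits in the working directory, and the detection parameters are illustrative rather than the repo’s exact settings.

import cv2

cascade = cv2.CascadeClassifier("lbpcascade_animeface.xml")

def crop_and_resize_faces(image_path, size=96):
    image = cv2.imread(image_path)
    gray = cv2.equalizeHist(cv2.cvtColor(image, cv2.COLOR_BGR2GRAY))
    faces = cascade.detectMultiScale(gray, scaleFactor=1.1,
                                     minNeighbors=5, minSize=(24, 24))
    # one crop per detected face, resized to size x size pixels
    return [cv2.resize(image[y:y + h, x:x + w], (size, size))
            for (x, y, w, h) in faces]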

The raw directory contains 36 images for each character (JPG & PNG format). The first 30 images are used for training, while the last 6 are used for testing.

As an example, we got the following result after applying Step 1 (the cropped directory is shown on the right side):

lbpcascade_animeface can detect character faces with an accuracy of around 83%. Failed images are stored in raw (unrecognized) for future improvements.

Since we have 3 characters and 6 test images for each, none of which are part of training, resized_for_test contains 18 images in total. Surprisingly, almost all characters are detected properly!

Update (Nov 13, 2017): See the animeface-2009 section below, which pushes face detection accuracy to 93%.

    Requirements

    Steps

1. The following command is used to populate the cropped directory.
$ python bulk_convert.py raw/[character_name] cropped

2. The following command is used to populate the resized_for_training & resized_for_test directories.
$ python bulk_resize.py cropped/[character_name] resized

    After running the step above, you can decide how many images will be used in resized_for_training and how many images will be used in resized_for_test.

3. Re-train the Inception model using transfer learning:
$ bazel-bin/tensorflow/examples/image_retraining/retrain --image_dir ~/transfer-learning-anime/resized_for_training/
$ bazel build tensorflow/examples/image_retraining:label_image

4. At this point, the model is ready to use. We can run the following command to get the classification result:
$ bazel-bin/tensorflow/examples/image_retraining/label_image --graph=/tmp/output_graph.pb --labels=/tmp/output_labels.txt --output_layer=final_result:0 --image=$HOME/transfer-learning-anime/resized_for_test/[character name]/[image name]

    If everything works properly, you will get the classification result. See TensorFlow Documentation for more options.

    Optionally, sample model can be downloaded by running download_model.sh script inside models (example) directory.

    Result Analysis

Initially, we ran the experiment with 2 characters: Nishikino Maki and Takimoto Hifumi.

    INFO:tensorflow:2017-11-10 08:50:36.151387: Step 3999: Train accuracy = 100.0%
    INFO:tensorflow:2017-11-10 08:50:36.151592: Step 3999: Cross entropy = 0.002191
    INFO:tensorflow:2017-11-10 08:50:36.210147: Step 3999: Validation accuracy = 100.0% (N=100)
    INFO:tensorflow:Final test accuracy = 92.9% (N=14)
    

The result is as follows. Each bullet shows the OK/NG outcome and the scores for one test image.

Nishikino Maki test images:

• OK: nishikino maki (score = 0.99874), takimoto hifumi (score = 0.00126)
• OK: nishikino maki (score = 0.75519), takimoto hifumi (score = 0.24481)
• OK: nishikino maki (score = 0.99513), takimoto hifumi (score = 0.00487)
• OK: nishikino maki (score = 0.98629), takimoto hifumi (score = 0.01371)
• OK: nishikino maki (score = 0.99723), takimoto hifumi (score = 0.00277)
• OK: nishikino maki (score = 0.99695), takimoto hifumi (score = 0.00305)
Takimoto Hifumi test images:

• OK: takimoto hifumi (score = 0.63084), nishikino maki (score = 0.36916)
• OK: takimoto hifumi (score = 0.99728), nishikino maki (score = 0.00272)
• OK: takimoto hifumi (score = 0.99972), nishikino maki (score = 0.00028)
• OK: takimoto hifumi (score = 0.98852), nishikino maki (score = 0.01148)
• OK: takimoto hifumi (score = 0.99456), nishikino maki (score = 0.00544)
• OK: takimoto hifumi (score = 0.96630), nishikino maki (score = 0.03370)

From the results above, 10 out of 12 top scores are above 0.95, while the lowest is 0.63.

At this point, I decided to add Sakurauchi Riko, who is known for her similarity to Nishikino Maki.

    INFO:tensorflow:2017-11-10 13:13:59.270717: Step 3999: Train accuracy = 100.0%
    INFO:tensorflow:2017-11-10 13:13:59.270912: Step 3999: Cross entropy = 0.005526
    INFO:tensorflow:2017-11-10 13:13:59.328139: Step 3999: Validation accuracy = 100.0% (N=100)
    INFO:tensorflow:Final test accuracy = 80.0% (N=15)
    

With 3 similar characters, the results are as follows.

Nishikino Maki test images:

• OK: nishikino maki (score = 0.99352), sakurauchi riko (score = 0.00612), takimoto hifumi (score = 0.00036)
• OK: nishikino maki (score = 0.47391), sakurauchi riko (score = 0.37913), takimoto hifumi (score = 0.14696)
• OK: nishikino maki (score = 0.95976), sakurauchi riko (score = 0.02797), takimoto hifumi (score = 0.01227)
• OK: nishikino maki (score = 0.88851), sakurauchi riko (score = 0.07526), takimoto hifumi (score = 0.03623)
• OK: nishikino maki (score = 0.99025), sakurauchi riko (score = 0.00766), takimoto hifumi (score = 0.00209)
• OK: nishikino maki (score = 0.96782), sakurauchi riko (score = 0.02783), takimoto hifumi (score = 0.00435)

As you can see above, the similarity between Nishikino Maki and Sakurauchi Riko starts to lower the confidence of the resulting model. Nevertheless, all classifications are still correct, and 4 out of 6 maintain a score above 0.95.

Takimoto Hifumi test images:

• OK: takimoto hifumi (score = 0.86266), nishikino maki (score = 0.13632), sakurauchi riko (score = 0.00102)
• OK: takimoto hifumi (score = 0.87614), sakurauchi riko (score = 0.12334), nishikino maki (score = 0.00051)
• OK: takimoto hifumi (score = 0.99964), sakurauchi riko (score = 0.00023), nishikino maki (score = 0.00013)
• OK: takimoto hifumi (score = 0.99417), nishikino maki (score = 0.00472), sakurauchi riko (score = 0.00110)
• OK: takimoto hifumi (score = 0.94923), sakurauchi riko (score = 0.04842), nishikino maki (score = 0.00235)
• OK: takimoto hifumi (score = 0.96029), sakurauchi riko (score = 0.02822), nishikino maki (score = 0.01150)

Interestingly, the addition of the 3rd character increases the confidence level of several Takimoto Hifumi test cases (see the 1st and 4th results). Overall, this character is easily differentiated from the other two.

Sakurauchi Riko test images:

• OK: sakurauchi riko (score = 0.98747), takimoto hifumi (score = 0.01054), nishikino maki (score = 0.00199)
• OK: sakurauchi riko (score = 0.96840), takimoto hifumi (score = 0.02895), nishikino maki (score = 0.00265)
• OK: sakurauchi riko (score = 0.97713), nishikino maki (score = 0.02167), takimoto hifumi (score = 0.00119)
• OK: sakurauchi riko (score = 0.90159), nishikino maki (score = 0.06989), takimoto hifumi (score = 0.02852)
• OK: sakurauchi riko (score = 0.99713), takimoto hifumi (score = 0.00184), nishikino maki (score = 0.00103)
• OK: sakurauchi riko (score = 0.79957), nishikino maki (score = 0.19310), takimoto hifumi (score = 0.00733)

From this experiment, it seems that the current bottleneck is located at Step 1 (face detection), which has an overall accuracy of about 83%.

    animeface-2009

nagadomi/animeface-2009 provides another method of face detection. 13 out of 21 previously unrecognized images are now recognized; see the cropped (unrecognized) directory.

Currently found limitations: the script seems to require more memory and run slower than lbpcascade_animeface.xml.

Results with animeface-2009 detection, again one bullet per test image:

Nishikino Maki test images:

• OK: nishikino maki (score = 0.99296), sakurauchi riko (score = 0.00694), takimoto hifumi (score = 0.00010)
• OK: nishikino maki (score = 0.93702), sakurauchi riko (score = 0.04017), takimoto hifumi (score = 0.02281)
• OK: nishikino maki (score = 0.99406), sakurauchi riko (score = 0.00565), takimoto hifumi (score = 0.00030)

Takimoto Hifumi test images:

• OK: takimoto hifumi (score = 0.99242), nishikino maki (score = 0.00431), sakurauchi riko (score = 0.00327)
• OK: takimoto hifumi (score = 0.99596), sakurauchi riko (score = 0.00403), nishikino maki (score = 0.00001)
• OK: takimoto hifumi (score = 0.98369), sakurauchi riko (score = 0.01498), nishikino maki (score = 0.00133)
• OK: takimoto hifumi (score = 0.99796), sakurauchi riko (score = 0.00189), nishikino maki (score = 0.00015)
• OK: takimoto hifumi (score = 0.99601), nishikino maki (score = 0.00335), sakurauchi riko (score = 0.00064)
• OK: takimoto hifumi (score = 0.99960), sakurauchi riko (score = 0.00029), nishikino maki (score = 0.00011)
• OK: takimoto hifumi (score = 0.99995), nishikino maki (score = 0.00004), sakurauchi riko (score = 0.00001)

Sakurauchi Riko test images:

• OK: sakurauchi riko (score = 0.84480), nishikino maki (score = 0.12101), takimoto hifumi (score = 0.03419)
• OK: sakurauchi riko (score = 0.94310), nishikino maki (score = 0.04296), takimoto hifumi (score = 0.01393)
• OK: sakurauchi riko (score = 0.96176), takimoto hifumi (score = 0.03217), nishikino maki (score = 0.00607)

Since this method gives better results in detecting anime character faces, and classification still works with almost the same results, the overall face detection accuracy is now around 93%.

    License

    lbpcascade_animeface.xml is created by nagadomi/lbpcascade_animeface.

    Copyright for all images are owned by their respective creators.

    Visit original content creator repository https://github.com/freedomofkeima/transfer-learning-anime