This repository was archived by the owner on Jan 10, 2025. It is now read-only.

Commit 34fed4d

Author: kephircheek
Merge pull request #30 from kephircheek/bugfix
Bugfix #2 #4 #13 #15 #19
2 parents a4e8dbc + 411fbbd commit 34fed4d

2 files changed: +66, -28 lines changed


bibliography.bib

Lines changed: 33 additions & 0 deletions
@@ -1086,4 +1086,37 @@ @Article{pyrkov2019
   publisher = {{MDPI} {AG}},
 }
 
+@InProceedings{weikang2016,
+  author    = {Rui, Weikang and Xing, Kai and Jia, Yawei},
+  editor    = {Lehner, Franz and Fteimi, Nora},
+  title     = {BOWL: Bag of Word Clusters Text Representation Using Word Embeddings},
+  booktitle = {Knowledge Science, Engineering and Management},
+  year      = {2016},
+  publisher = {Springer International Publishing},
+  address   = {Cham},
+  pages     = {3--14},
+  abstract  = {The text representation is fundamental for text mining and information retrieval. The Bag Of Words (BOW) and its variants (e.g. TF-IDF) are very basic text representation methods. Although the BOW and TF-IDF are simple and perform well in tasks like classification and clustering, its representation efficiency is extremely low. Besides, word level semantic similarity is not captured which results failing to capture text level similarity in many situations. In this paper, we propose a straightforward Bag Of Word cLusters (BOWL) representation for texts in a higher level, much lower dimensional space. We exploit the word embeddings to group semantically close words and consider them as a whole. The word embeddings are trained on a large corpus and incorporate extensive knowledge. We demonstrate on three benchmark datasets and two tasks, that BOWL representation shows significant advantages in terms of representation accuracy and efficiency.},
+  isbn      = {978-3-319-47650-6}
+}
+
+@InProceedings{appiah2009,
+  doi       = {10.1109/ijcnn.2009.5179001},
+  url       = {https://doi.org/10.1109%2Fijcnn.2009.5179001},
+  year      = {2009},
+  month     = {jun},
+  publisher = {{IEEE}},
+  author    = {Kofi Appiah and Andrew Hunter and Hongying Meng and Shigang Yue and Mervyn Hobden and Nigel Priestley and Peter Hobden and Cy Pettit},
+  title     = {A binary Self-Organizing Map and its {FPGA} implementation},
+  booktitle = {2009 International Joint Conference on Neural Networks}
+}
+
+@InProceedings{santana2017,
+  doi       = {10.1109/ijcnn.2017.7966174},
+  url       = {https://doi.org/10.1109%2Fijcnn.2017.7966174},
+  year      = {2017},
+  month     = {may},
+  publisher = {{IEEE}},
+  author    = {Alessandra Santana and Alessandra Morais and Marcos G. Quiles},
+  title     = {An alternative approach for binary and categorical self-organizing maps},
+  booktitle = {2017 International Joint Conference on Neural Networks ({IJCNN})}
+}
 @Comment{jabref-meta: databaseType:bibtex;}

manuscript.tex

Lines changed: 33 additions & 28 deletions
@@ -156,46 +156,53 @@ \subsection{The classical algorithm}
 The SOFM is one of the most widely-used unsupervised learning methods used in various areas of modern science.
 It was first proposed by Kohonen as a self-organizing unsupervised learning algorithm which produces feature maps similar to those occurring in the brain \cite{solan2001}.
 The SOFM algorithm operates with a set of input objects, each represented by a $N$-dimensional vector,
-and describes a mapping from a higher-dimensional input space to a lower-dimensional map space.
+and describes a mapping from a higher-dimensional input space to a lower-dimensional map space, commonly a two-dimensional map.
 
 The input dimensions are associated with the features,
 and the nodes in the grid (called cluster vectors) are assigned the $N$-dimensional vectors.
 The components of these vectors are usually called weights.
-Initially the weight components are chosen randomly.
-We then can train our SOFM adjusting the components through the learning process which occur in the two basic procedures of selecting a winning cluster vector and updating its weights (Fig.~\ref{fig:sofm_fitting}).
-More specifically, they consist of four step process: \begin{enumerate*}
+Initially the weight components are chosen randomly
+and the topological distances between neurons are given.
+We can then train our SOFM by adjusting the components through a learning process that consists of two basic procedures:
+selecting a winning cluster vector, also called the best matching unit (BMU), and updating its weights (Fig.~\ref{fig:sofm_fitting}).
+More specifically, the learning process is a four-step procedure:
+\begin{enumerate*}
 \item selecting an input vector randomly from the set of all input vectors;
-\item finding a cluster vector which is closest to the input vector;
-\item adjusting the weights of the winning node in such a way that it becomes even closer to the input vector;
+\item finding the cluster vector which is closest to the input vector (the BMU);
+\item adjusting the weights of the BMU and of the neurons close to it on the feature map in such a way
+that these vectors become even closer to the input vector;
 \item repeating this process for many iterations until it converges.
 \end{enumerate*}
 
 
-After the winning cluster vector is selected, the weights of the vector are adjusted according to
+At step $t$, when the BMU $\vec{w}_{c}$ is selected,
+the weights of the BMU and of its neighbors on the feature map are adjusted according to
 %
 \begin{equation}
-\vec w_{i+1} =
-\vec w_i
-+ \alpha\left(\vec{x} - \vec w_i\right).
 \label{eq:learning}
+\vec{w_{i}}(t + 1)
+= \vec{w_i}(t)
++ \theta(c, i, t) \alpha(t)
+\left(\vec{x}(t) - \vec{w_i}(t)\right),
 \end{equation}
 %
-The above expression can be interpreted according to:
-if a component of the input vector $\vec{x}$ is greater than the corresponding weight $ \vec{w}_i $,
-increase the weight by a small amount with the learning rate $\alpha$;
-if the input component is smaller than the weight, decrease the weight by a small amount.
-The larger the difference between the input component and the weight component, the larger the increment (decrement).
+where $\alpha(t)$ is the learning rate and $\theta(c, i, t)$ is the neighborhood function,
+which defines the topological neighbor neurons to be updated.
+Note that the neighborhood function depends on the distances on the feature map given initially,
+not on the distance metric between the vectors.
+
 Intuitively, this procedure can be geometrically interpreted as iteratively moving the cluster vectors in space one at a time in a way
 that ensures each move is following the current trends inferred from their distances to the input objects.
 A visualisation of this process is shown in Fig. \ref{fig:sofm_fitting}.
 
-Usually the winning cluster vector is selected based on the Euclidean distance between an input vector and the cluster vectors.
-In our approach, we use the Hamming distance instead of the Euclidean distance to select the winning cluster vector.
-It allows us to use a simpler encoding of the classical information into the quantum state and use an effective procedure for the calculation of the Hamming distance on the quantum machine,
-such as to reduce the number of calculations in number of cluster vectors in comparison to the classical case.
-
-
+The original version of the SOFM was designed to cluster real-valued data,
+and the winning cluster vector is selected based on the Euclidean distance between an input vector and the cluster vectors.
+This paper deals with the clustering of binary vectors,
+and for binary data the Hamming distance is more suitable \cite{appiah2009, santana2017}.
+Using a known technique of encoding classical information into a quantum register,
+based on probabilistic quantum memories \cite{trugenberger2001},
+we introduce an optimized algorithm for calculating the matrix of Hamming distances
+between each pair of binary vectors of two sets, taking advantage of quantum parallelism.
 
 
 
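To make the training loop above concrete, here is a minimal classical sketch of one SOFM run: random sampling of inputs, BMU selection by Hamming distance, and the neighborhood-weighted update rule. The map size, the exponentially decaying schedules for the learning rate and neighborhood width, and the Gaussian neighborhood function are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def train_sofm(X, grid_shape=(3, 3), n_iter=500, alpha0=0.5, sigma0=1.0):
    """Minimal SOFM sketch: random sampling, BMU search, neighborhood update."""
    n_nodes = grid_shape[0] * grid_shape[1]
    # Cluster (weight) vectors, initialized randomly.
    W = rng.random((n_nodes, X.shape[1]))
    # Fixed topological coordinates of the nodes on the 2-D feature map.
    grid = np.array([(i, j) for i in range(grid_shape[0])
                            for j in range(grid_shape[1])], dtype=float)

    for t in range(n_iter):
        alpha = alpha0 * np.exp(-t / n_iter)        # learning rate alpha(t)
        sigma = sigma0 * np.exp(-t / n_iter)        # neighborhood width
        x = X[rng.integers(len(X))]                 # step 1: pick a random input vector
        # Step 2: BMU = node whose (binarized) weights differ from x in the
        # fewest components, i.e. minimal Hamming distance for binary inputs.
        c = int(np.argmin(np.count_nonzero(W.round() != x, axis=1)))
        # Step 3: neighborhood function theta(c, i, t) over distances on the map grid.
        theta = np.exp(-np.linalg.norm(grid - grid[c], axis=1) ** 2 / (2 * sigma ** 2))
        W += theta[:, None] * alpha * (x - W)       # the update rule above
        # Step 4: repeat until the loop ends / the map converges.
    return W
```

Run on binary input vectors (such as the 9-dimensional bag-of-words vectors used later), this returns one cluster vector per map node, which can be binarized to read off the clusters.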
@@ -245,9 +252,6 @@ \subsection{Optimized quantum scheme for Hamming distance calculation}
 
 
 
-We now introduce an optimized algorithm for calculating the matrix of Hamming distances \cite{trugenberger2001} between a sample vector and all cluster vectors, making use of quantum parallelism.
-
-%This allows for a simple encoding of the classical information into a quantum register.
 
 The overall procedure involves two registers of $n$ qubits each, denoted $\left| X \right\rangle$ and $\left| Y \right\rangle$, along with a single auxiliary qubit $\left| a \right\rangle$.
 During the whole process, the $\left| Y \right\rangle$ register is used to store the cluster states.
@@ -291,7 +295,8 @@ \subsection{Optimized quantum scheme for Hamming distance calculation}
 instead it stores the information about pairwise different qubits between the input vector $\{X\}$ and cluster vector $\{Y\}$.
 Next, for each pair $\{X\}$ and $\{Y\}$, the accumulated information of all the differences is projected onto the amplitude of the superposed state.
 This is achieved by applying the Hadamard gate on auxiliary qubit,
-followed by a controlled rotation around $z$-axis gate on $\left| Xa \right\rangle$ defined as
+followed by a controlled rotation around the $z$-axis~(\ref{eq:controled_rotation}) on $\left| a\, d_{ij}^{(\alpha)} \right\rangle$,
+where $d_{ij}^{(\alpha)}$ is the control qubit and the ancilla qubit $\left| a \right\rangle$ is the target.
 %
 \begin{equation}
 \label{eq:controled_rotation}
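For orientation, the quantity this quantum subroutine ultimately evaluates is the matrix of pairwise Hamming distances between two sets of binary vectors. The snippet below is a plain classical NumPy restatement of that target quantity, useful for checking small examples; it is not the quantum circuit itself.

```python
import numpy as np

def hamming_matrix(X, Y):
    """Pairwise Hamming distances between the rows of two binary matrices.

    X: (m, n) array of 0/1 values, Y: (k, n) array of 0/1 values.
    Returns an (m, k) integer matrix D with D[i, j] = d_H(X[i], Y[j]).
    """
    X = np.asarray(X, dtype=np.uint8)
    Y = np.asarray(Y, dtype=np.uint8)
    # XOR marks the pairwise different components; summing over them counts the differences.
    return np.bitwise_xor(X[:, None, :], Y[None, :, :]).sum(axis=2, dtype=int)
```

With X and Y both set to the vectorized abstracts, this reproduces the kind of distance matrix shown in panel (b) of Fig. vectorized_sample.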
@@ -440,7 +445,7 @@ \subsection{Optimized quantum scheme for Hamming distance calculation}
 \begin{figure}[t]
 \includegraphics[width=0.95\columnwidth]{vectorized_sample.png}
 \caption{
-(a) Representation of the data set of abstracts with the bag-of-words model is shown.
+(a) Representation of the data set of abstracts with the bag-of-words \cite{weikang2016} model is shown.
 Each abstract is represented by a binary vector with 9 elements, corresponding to the 9 words on the horizontal axis.
 The samples are sorted into groups (QML, MED, BIO) with 3 papers for each tag, for a total of 9 paper.
 (b) The Hamming distance between each vectorized abstract is shown as a number in the matrix.
@@ -478,7 +483,7 @@ \section{Experimental demonstration of QASOFM}
 ``Quantum Machine Learning'' (QML),
 ``Cancer'' (MED)
 and ``Gene Expression'' (BIO).
-Abstracts were vectorized by the bag-of-words model in order to choose most defining words in each data set (see Fig.~\ref{fig:vectorized_sample}) \cite{mctear2016}.
+Abstracts were vectorized by the bag-of-words \cite{weikang2016} model in order to choose most defining words in each data set (see Fig.~\ref{fig:vectorized_sample}) \cite{mctear2016}.
 This model represents text as a multiset ``bag'' of its words taking into account only multiplicity of words.
 Preparing the bag-of-words we excluded the words that appear only in one abstract and more than in 4 abstracts and we also excluded the word ``level'' from consideration due to the frequent overlap between the clusters because it gives instabilities for both classical and quantum algorithms.
 We restricted our bag-of-word size to 9 of the most frequent words from the full bags-of-word due to limitations of the number of qubits.
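The preprocessing described in this hunk can be sketched with scikit-learn's CountVectorizer. The parameter values below (min_df, max_df, max_features, and the hand-excluded word "level") mirror the description above but are an illustrative guess at the pipeline, not the authors' actual script.

```python
from sklearn.feature_extraction.text import CountVectorizer

def vectorize_abstracts(abstracts):
    """Binary bag-of-words sketch for a list of raw abstract strings."""
    vectorizer = CountVectorizer(
        binary=True,           # keep only presence/absence of each word
        min_df=2,              # drop words appearing in only one abstract
        max_df=4,              # drop words appearing in more than 4 abstracts
        max_features=9,        # keep the 9 most frequent remaining words (qubit limit)
        stop_words=["level"],  # word excluded by hand due to cluster overlap
    )
    X = vectorizer.fit_transform(abstracts)  # sparse (n_abstracts, 9) matrix of 0/1
    return X.toarray(), vectorizer.get_feature_names_out()
```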
