diff --git a/week7_assignment_submit_JIMIN.ipynb b/week7_assignment_submit_JIMIN.ipynb
new file mode 100644
index 0000000..0380791
--- /dev/null
+++ b/week7_assignment_submit_JIMIN.ipynb
@@ -0,0 +1,428 @@
+{
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
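+    "1. Explain the difference between mean shift and K-means, and describe the advantages and disadvantages of mean shift.\n",
+    "\n",
+    "Like K-means, mean shift performs clustering by repeatedly moving each centroid toward the center of its cluster.\n",
+    "However, K-means moves a centroid to the mean position of the data points assigned to it, whereas mean shift moves a centroid toward the region where the data density is highest.\n",
+    "\n",
+    "K-means\n",
+    "Advantages: very simple and intuitive.\n",
+    "Disadvantages: limited when applied to data sets with complex structure, and the optimal number of clusters is hard to determine.\n",
+    "\n",
+    "Mean shift\n",
+    "Advantages: performs well at separating specific objects or tracking motion in image and video data; because it does not assume a particular shape for the data set or a particular distribution-based model, it clusters flexibly.\n",
+    "Disadvantages: it takes a long time to run, and the clustering result is highly sensitive to the bandwidth size."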
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
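+    "2. Explain how the KDE changes with the bandwidth h in mean shift, and how h affects the number of mean shift clusters.\n",
+    "\n",
+    "The bandwidth h smooths the shape of the KDE, and its value can strongly affect the quality of the probability density estimate.\n",
+    "A small h (e.g., 1) yields a narrow, spiky KDE that is prone to overfitting and produces many cluster centers.\n",
+    "A large h (e.g., 10) yields an over-smoothed KDE that is prone to underfitting and produces few cluster centers."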
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
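+    "3. Explain how GMM performs clustering.\n",
+    "\n",
+    "GMM clustering assumes that the data to be clustered was generated as a mixture of several groups of data, each following its own Gaussian distribution, and performs clustering under that assumption."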
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
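+    "4. Explain the differences between GMM and K-means.\n",
+    "\n",
+    "KMeans clusters within a circular range around each centroid, so it is more effective the closer the data set is to a circular distribution.\n",
+    "\n",
+    "GMM uses clustering based on normal distributions, so it applies more flexibly than K-means to a wider variety of data sets, but it has the drawback of taking longer to run."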
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
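+    "5. Describe the two most important parameters of DBSCAN and explain how DBSCAN performs clustering.\n",
+    "\n",
+    "DBSCAN is a representative density-based clustering algorithm. Because it is based on differences in density within a given space, it clusters well even on data sets with complex geometric distributions.\n",
+    "\n",
+    "The two most important parameters are epsilon and the minimum number of points (min points).\n",
+    "\n",
+    "Epsilon neighborhood: the circular region of radius epsilon centered on an individual data point.\n",
+    "\n",
+    "Minimum number of points: the number of other data points contained in a data point's epsilon neighborhood, used as the threshold for treating that point as a core point."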
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
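+    "6. Describe the text preprocessing tasks (e.g., cleansing, tokenization, ...).\n",
+    "\n",
+    "Cleansing: removing, in advance, unnecessary characters and symbols that would otherwise get in the way of the analysis.\n",
+    "\n",
+    "Tokenization: sentence tokenization, which splits a document into sentences, and word tokenization, which splits a sentence into word tokens.\n",
+    "\n",
+    "Stop word removal: removing words that carry little meaning for the analysis (words that are grammatically necessary but add little semantic content).\n",
+    "\n",
+    "Stemming, Lemmatization: finding the base form of words that change grammatically or semantically. Lemmatization generally finds more accurate base forms than Stemming.\n"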
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
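+    "7. Explain the difference between count-based vectorization and TF-IDF-based vectorization in BOW feature vectorization.\n",
+    "\n",
+    "Count-based vectorization: assigns each word feature the number of times that word appears in each document, i.e., its count.\n",
+    "\n",
+    "TF-IDF: assigns high weight to words that appear frequently within an individual document, but penalizes words that appear frequently across all documents."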
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "8. Carry out a text analysis exercise using the attached data.\n",
+    "The data is available at the following link.\n",
+    "https://www.kaggle.com/seriousran/appletwittersentimenttexts"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "(1) Feature vectorization with CountVectorizer; model training/prediction/evaluation with LogisticRegression"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 2,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "import numpy as np\n",
+    "import pandas as pd\n",
+    "import matplotlib.pyplot as plt\n",
+    "%matplotlib inline\n",
+    "\n",
+    "from sklearn.model_selection import train_test_split\n",
+    "from sklearn.metrics import accuracy_score, precision_score, recall_score"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 5,
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "<class 'pandas.core.frame.DataFrame'>\n",
+      "RangeIndex: 1630 entries, 0 to 1629\n",
+      "Data columns (total 2 columns):\n",
+      " #   Column     Non-Null Count  Dtype \n",
+      "---  ------     --------------  ----- \n",
+      " 0   text       1630 non-null   object\n",
+      " 1   sentiment  1630 non-null   int64 \n",
+      "dtypes: int64(1), object(1)\n",
+      "memory usage: 25.6+ KB\n"
+     ]
+    }
+   ],
+   "source": [
+    "appDF=pd.read_csv('apple.csv')\n",
+    "appDF.info()"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 7,
+   "metadata": {},
+   "outputs": [
+    {
+     "data": {
+      "text/html": [
+       "<div>\n",
+       "<style scoped>\n",
+       "    .dataframe tbody tr th:only-of-type {\n",
+       "        vertical-align: middle;\n",
+       "    }\n",
+       "\n",
+       "    .dataframe tbody tr th {\n",
+       "        vertical-align: top;\n",
+       "    }\n",
+       "\n",
+       "    .dataframe thead th {\n",
+       "        text-align: right;\n",
+       "    }\n",
+       "</style>\n",
+       "<table border=\"1\" class=\"dataframe\">\n",
+       "  <thead>\n",
+       "    <tr style=\"text-align: right;\">\n",
+       "      <th></th>\n",
+       "      <th>text</th>\n",
+       "      <th>sentiment</th>\n",
+       "    </tr>\n",
+       "  </thead>\n",
+       "  <tbody>\n",
+       "    <tr>\n",
+       "      <th>0</th>\n",
+       "      <td>Wow. Yall needa step it up @Apple RT @heynyla:...</td>\n",
+       "      <td>-1</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>1</th>\n",
+       "      <td>What Happened To Apple Inc?   http://t.co/FJEX...</td>\n",
+       "      <td>0</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>2</th>\n",
+       "      <td>Thank u @apple I can now compile all of the pi...</td>\n",
+       "      <td>1</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>3</th>\n",
+       "      <td>The oddly uplifting story of the Apple co-foun...</td>\n",
+       "      <td>0</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>4</th>\n",
+       "      <td>@apple can i exchange my iphone for a differen...</td>\n",
+       "      <td>0</td>\n",
+       "    </tr>\n",
+       "  </tbody>\n",
+       "</table>\n",
+       "</div>"
+      ],
+      "text/plain": [
+       "                                                text  sentiment\n",
+       "0  Wow. Yall needa step it up @Apple RT @heynyla:...         -1\n",
+       "1  What Happened To Apple Inc?   http://t.co/FJEX...          0\n",
+       "2  Thank u @apple I can now compile all of the pi...          1\n",
+       "3  The oddly uplifting story of the Apple co-foun...          0\n",
+       "4  @apple can i exchange my iphone for a differen...          0"
+      ]
+     },
+     "execution_count": 7,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "appDF.head(5)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 10,
+   "metadata": {},
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "Index(['text', 'sentiment'], dtype='object')"
+      ]
+     },
+     "execution_count": 10,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "appDF.keys()"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 12,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "X = appDF['text']\n",
+    "y = appDF['sentiment']\n",
+    "\n",
+    "X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 13,
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "CountVectorizer shape of training text: (1304, 4464)\n"
+     ]
+    }
+   ],
+   "source": [
+    "from sklearn.feature_extraction.text import CountVectorizer\n",
+    "\n",
+    "# Perform feature extraction with count vectorization.\n",
+    "cnt_vect = CountVectorizer()\n",
+    "# Source code changed in the revised edition (2019.12.24)\n",
+    "cnt_vect.fit(X_train)\n",
+    "X_train_cnt_vect = cnt_vect.transform(X_train)\n",
+    "\n",
+    "# Transform the test data with the CountVectorizer that was fit() on the training data.\n",
+    "X_test_cnt_vect = cnt_vect.transform(X_test)\n",
+    "\n",
+    "print('CountVectorizer shape of training text:', X_train_cnt_vect.shape)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 14,
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "CountVectorized Logistic Regression accuracy: 0.825\n"
+     ]
+    }
+   ],
+   "source": [
+    "from sklearn.linear_model import LogisticRegression\n",
+    "from sklearn.metrics import accuracy_score\n",
+    "\n",
+    "# Train, predict, and evaluate with LogisticRegression.\n",
+    "lr_clf = LogisticRegression()\n",
+    "lr_clf.fit(X_train_cnt_vect , y_train)\n",
+    "pred = lr_clf.predict(X_test_cnt_vect)\n",
+    "print('CountVectorized Logistic Regression accuracy: {0:.3f}'.format(accuracy_score(y_test, pred)))"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "(2) Feature vectorization with TF-IDF; model training/prediction/evaluation with LogisticRegression"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 25,
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "TF-IDF Logistic Regression accuracy: 0.804\n"
+     ]
+    }
+   ],
+   "source": [
+    "from sklearn.feature_extraction.text import TfidfVectorizer\n",
+    "\n",
+    "# Apply TF-IDF vectorization to transform the training and test data sets.\n",
+    "tfidf_vect = TfidfVectorizer()\n",
+    "tfidf_vect.fit(X_train)\n",
+    "X_train_tfidf_vect = tfidf_vect.transform(X_train)\n",
+    "X_test_tfidf_vect = tfidf_vect.transform(X_test)\n",
+    "\n",
+    "# Train, predict, and evaluate with LogisticRegression.\n",
+    "lr_clf = LogisticRegression()\n",
+    "lr_clf.fit(X_train_tfidf_vect , y_train)\n",
+    "pred = lr_clf.predict(X_test_tfidf_vect)\n",
+    "print('TF-IDF Logistic Regression accuracy: {0:.3f}'.format(accuracy_score(y_test, pred)))"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 16,
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "TF-IDF Vectorized Logistic Regression accuracy: 0.776\n"
+     ]
+    }
+   ],
+   "source": [
+    "# Add stop word filtering, change ngram_range from the default (1,1) to (1,2), and set max_df=300 for feature vectorization.\n",
+    "tfidf_vect = TfidfVectorizer(stop_words='english', ngram_range=(1,2), max_df=300 )\n",
+    "tfidf_vect.fit(X_train)\n",
+    "X_train_tfidf_vect = tfidf_vect.transform(X_train)\n",
+    "X_test_tfidf_vect = tfidf_vect.transform(X_test)\n",
+    "\n",
+    "lr_clf = LogisticRegression()\n",
+    "lr_clf.fit(X_train_tfidf_vect , y_train)\n",
+    "pred = lr_clf.predict(X_test_tfidf_vect)\n",
+    "print('TF-IDF Vectorized Logistic Regression accuracy: {0:.3f}'.format(accuracy_score(y_test, pred)))"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 30,
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "Fitting 3 folds for each of 5 candidates, totalling 15 fits\n",
+      "Logistic Regression best C parameter : {'C': 12}\n",
+      "TF-IDF Vectorized Logistic Regression accuracy: 0.819\n"
+     ]
+    }
+   ],
+   "source": [
+    "from sklearn.model_selection import GridSearchCV\n",
+    "\n",
+    "# Tune to find the best C value, using 3-fold cross-validation.\n",
+    "params = { 'C':[12, 12.1, 12.4, 12.6, 12.8]}\n",
+    "grid_cv_lr = GridSearchCV(lr_clf ,param_grid=params , cv=3 , scoring='accuracy' , verbose=1 )\n",
+    "grid_cv_lr.fit(X_train_tfidf_vect , y_train)\n",
+    "print('Logistic Regression best C parameter :',grid_cv_lr.best_params_ )\n",
+    "\n",
+    "# Predict with grid_cv_lr, refit on the best C value, and evaluate accuracy.\n",
+    "pred = grid_cv_lr.predict(X_test_tfidf_vect)\n",
+    "print('TF-IDF Vectorized Logistic Regression accuracy: {0:.3f}'.format(accuracy_score(y_test, pred)))"
+   ]
+  }
+ ],
+ "metadata": {
+  "interpreter": {
+   "hash": "97ae724bfa85b9b34df7982b8bb8c7216f435b92902d749e4263f71162bea840"
+  },
+  "kernelspec": {
+   "display_name": "Python 3",
+   "language": "python",
+   "name": "python3"
+  },
+  "language_info": {
+   "codemirror_mode": {
+    "name": "ipython",
+    "version": 3
+   },
+   "file_extension": ".py",
+   "mimetype": "text/x-python",
+   "name": "python",
+   "nbconvert_exporter": "python",
+   "pygments_lexer": "ipython3",
+   "version": "3.8.8"
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 2
+}