1. _ _ _ _ _ _ _ _ is a triplet summarizing information about subclusters of objects<===>clustering feature (CF)

2. _ _ _ _ _ _ _ & _ _ _ _ _ _ _ _ are techniques for improving overall classifier accuracy by learning and combining a series of individual classifiers.<===>bagging & boosting

3. _ _ _ _ _ _ _ _ & _ _ _ _ _ _ _ are the two major types of prediction problems.<===>classification-regression

4. _ _ _ _ _ _ _ _ _ _ _ _ is a signal processing technique that decomposes a signal into different frequency sub bands.<===>Wavelet transform

5. _ _ _ _ _ _ _ _ _ _ _ are modified so as to minimize the mean squared error between the network's prediction and the actual class<===>weights of the nodes

6. _ _ _ _ _ _ _ _ _ _ _ is a two step process<===>data classification

7. _ _ _ _ _ _ _ _ _ _ _ methods use statistical measures to remove the least reliable branches<===>tree pruning

8. _ _ _ _ _ _ _ _ _ _ _ performs multidimensional clustering in 2 steps.<===>CLIQUE

9. _ _ _ _ _ _ _ _ _ _ _ verifies whether an object is significantly large or small in relation to the distribution F.<===>discordancy test

10. _ _ _ _ _ _ _ _ _ _ is a collection of pointers to spatial objects.<===>spatial measures

11. _ _ _ _ _ _ _ _ _ _ is used to identify outliers w.r.t the model.<===>discordancy test

12. _ _ _ _ _ _ _ _ _ _ methods quantize the object space into a finite number of cells that form a grid structure.<===>grid-based methods

13. _ _ _ _ _ _ _ _ _ _ notation can be used to represent a sequence of actions of the same type.<===>+

14. _ _ _ _ _ _ _ _ _ _ uses the concept to generalize the data by replacing lower-level data with high-level concepts.<===>attribute oriented induction

15. _ _ _ _ _ _ _ _ _ and _ _ _ _ _ _ _ _ _ are two operators of COBWEB.<===>merging, splitting

16. _ _ _ _ _ _ _ _ _ can be used to find the most "natural" number of clusters using a silhouette coefficient.<===>CLARANS

17. _ _ _ _ _ _ _ _ _ doesn't require a metric distance between the objects.<===>dissimilarity function

18. _ _ _ _ _ _ _ _ _ finds all pairs of gap-free windows of a small length that are similar.<===>atomic matching

19. _ _ _ _ _ _ _ _ _ is a dimension whose primitive-level data are spatial but whose generalization, starting at a certain level, becomes non-spatial.<===>spatial-to-non-spatial dimension

20. _ _ _ _ _ _ _ _ _ is a task of mining significant patterns from a plan base.<===>plan mining

21. _ _ _ _ _ _ _ _ _ is used to assess the percentage of samples labeled as positive that actually are positive.<===>precision

22. _ _ _ _ _ _ _ _ _ is used to improve the efficiency of the Apriori algorithm.<===>Iceberg queries

23. _ _ _ _ _ _ _ _ _ refers to the preprocessing of data in order to remove noise.<===>data cleaning

24. _ _ _ _ _ _ _ _ _ stores and manages a large collection of multimedia objects.<===>multimedia database system

25. _ _ _ _ _ _ _ _ and _ _ _ _ _ _ _ _ _ are important means of generalization.<===>aggregation, approximation

26. _ _ _ _ _ _ _ _ can be modeled by adding polynomial terms to the basic linear model.<===>polynomial regression

27. _ _ _ _ _ _ _ _ is an optimization method for spatial association analysis.<===>progressive refinement

28. _ _ _ _ _ _ _ _ is one where properties can be inherited from more than one super class.<===>multiple inheritance

29. _ _ _ _ _ _ _ _ is the database that consists of sequence of ordered events with / without concrete notions of time.<===>sequence database

30. _ _ _ _ _ _ _ _ refer to the cycles which are long term oscillations about a trend line/curve which may/may not be periodic.<===>cyclic movements

31. _ _ _ _ _ _ _ _ regression can be modeled by adding polynomial terms to the basic linear model<===>polynomial

32. _ _ _ _ _ _ _ can automatically result in the removal of outliers.<===>Wavelet transform

33. _ _ _ _ _ _ _ compete in a "winner-take-all" fashion for the object that is currently presented to the system<===>competitive learning

34. _ _ _ _ _ _ _ has the ability to automatically adjust the number of classes in a partition.<===>COBWEB

35. _ _ _ _ _ _ _ is a density-based method that computes an augmented cluster ordering.<===>OPTICS

36. _ _ _ _ _ _ _ is a simple technique that uses a test set of class-labeled samples.<===>holdout method

37. _ _ _ _ _ are used to incorporate ideas of natural evolution<===>genetic algorithms

38. _ _ _ _ _ _ _ is defined in terms of Euclidean distance<===>closeness between 2 points

39. _ _ _ _ refers to the extraction of knowledge and spatial relationships not explicitly stored in spatial databases.<===>spatial data mining

40. A _ _ _ _ _ _ _ _ _ _ has complex tasks, graphics, images, videos, maps, voice, music etc.<===>multimedia database

41. A _ _ _ _ _ _ _ _ _ resembles a nominal variable.<===>discrete ordinal variable

42. A constraint "max(I.marks) >= 600" is acceptable for _ _ _ _ _ _ & _ _ _ _ _ _ _ categories.<===>monotone, succinct

43. A constraint such as "avg(I.marks) <= 70" is not a(n) _ _ _ _ _ _ _ _ _ _ _<===>anti-monotone

44. A huge amount of space-related data are in _ _ _ _ _ _ _ _ _ forms.<===>images

45. A neural network containing N hidden layers is called a(n) _ _ _ _ _ _ _ _ _ _ -layer neural network<===>(N+1)

46. A set of items is referred to as a(n) _ _ _ _ _ _ _ _ _ _<===>itemset

47. A set-valued attribute may be _ _ _ _ _ _ _ i) homogeneous ii) heterogeneous<===>i or ii

48. Accuracy estimates also help in _ _ _ _ _ _ _ _ _<===>comparison of different classifiers

49. Accuracy is given as _ _ _ _ _ _ _ _ _ _<===>specificity * (neg/(pos+neg)) + sensitivity *(pos/(pos+neg))
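
The formula in item 49 can be checked numerically. The sketch below uses made-up counts (all four numbers are hypothetical) and confirms that the weighted sum of sensitivity and specificity equals the plain fraction of correctly classified samples.

```python
def accuracy(sensitivity, specificity, pos, neg):
    """Accuracy as a weighted sum of sensitivity and specificity."""
    total = pos + neg
    return sensitivity * pos / total + specificity * neg / total


# Hypothetical counts: 90 of 100 positive samples and 180 of 200 negative
# samples are classified correctly.
t_pos, pos, t_neg, neg = 90, 100, 180, 200

acc = accuracy(t_pos / pos, t_neg / neg, pos, neg)
direct = (t_pos + t_neg) / (pos + neg)  # fraction classified correctly
print(acc, direct)
```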

50. AGNES is expanded to _ _ _ _ _ _ _ _<===>agglomerative nesting

51. Anti-monotone states _ _ _ _ _ _ _ _<===>if a set cannot pass a test, all its supersets also cannot pass the same test

52. Anti-monotone, monotone, succinct, convertible and inconvertible are five different categories of _ _ _ _ _ _ _ _ _ _ constraints.<===>rule constraints

53. Apart from prediction, the log-linear model is also useful for _ _ _ _ _<===>data compression

54. The Apriori algorithm employs a level-wise search, where k-itemsets are used to explore _ _ _ _ _ _ -itemsets.<===>(k+1)
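
The level-wise step in item 54 can be sketched as follows: frequent k-itemsets are joined into (k+1)-candidates, and a candidate is kept only if every k-subset is itself frequent. The function name `apriori_gen` and the sample itemsets are illustrative assumptions, not taken from the quiz.

```python
from itertools import combinations


def apriori_gen(frequent_k):
    """Build (k+1)-itemset candidates from a collection of frequent k-itemsets."""
    frequent_k = set(frequent_k)
    k = len(next(iter(frequent_k)))
    candidates = set()
    for a in frequent_k:
        for b in frequent_k:
            union = a | b
            # Keep unions of size k+1 whose k-subsets are all frequent
            # (the Apriori pruning property).
            if len(union) == k + 1 and all(
                frozenset(sub) in frequent_k for sub in combinations(union, k)
            ):
                candidates.add(union)
    return candidates


# Hypothetical frequent 2-itemsets.
l2 = {frozenset(p) for p in [("a", "b"), ("a", "c"), ("b", "c"), ("b", "d")]}
c3 = apriori_gen(l2)
print(c3)
```

Only {a, b, c} survives here, because {a, b, d} and {b, c, d} each contain a 2-subset that is not frequent.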

55. Assume the following data:
    X (years experience)    Y (salary)
    2                       13000
    9                       45000
    15                      75000
    3                       18000
Its least-squares β value is _ _ _ _ _ _ _ _ _ _ _ _ _ _ _<===>4737.9

56. Assume the following salary details:
    X (years experience)    Y (salary in Rs.)
    2                       13000
    9                       45000
    15                      75000
    4                       18000
The mean value of x = _ _ _ _ _ _ _ _ _<===>7.5
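
Items 55 and 56 can be worked through with the usual least-squares formulas, β = Σ(x − x̄)(y − ȳ) / Σ(x − x̄)² and α = ȳ − βx̄. The sketch below takes the four rows exactly as printed in each item; the function name `least_squares` is an assumption made for illustration.

```python
def least_squares(xs, ys):
    """Return (alpha, beta, x_bar, y_bar) for the line y = alpha + beta * x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    beta = sxy / sxx
    alpha = y_bar - beta * x_bar
    return alpha, beta, x_bar, y_bar


salaries = [13000, 45000, 75000, 18000]

# Item 55: experience values 2, 9, 15, 3.
alpha55, beta55, x_bar55, y_bar55 = least_squares([2, 9, 15, 3], salaries)

# Item 56: experience values 2, 9, 15, 4.
alpha56, beta56, x_bar56, y_bar56 = least_squares([2, 9, 15, 4], salaries)

print(round(beta55, 1), x_bar56)
```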

57. Backpropagation is a neural network _ _ _ _ _ _ _ _ _ _ _ algorithm<===>learning

58. Bayes' theorem provides a way of calculating which probability?<===>posterior

59. Block and consecutive procedures are 2 basic types for _ _ _ _ _ _ _ _ _ _ _<===>detecting outliers

60. Bootstrap aggregation is also known as _ _ _ _ _ _ _ _ _<===>bagging

61. car => financial from bank [loan=80 %, insurance=20 %] _ _ _ _ _ _ _ & _ _ _ _ _ are two measures of Association rules.<===>loan,insurance

62. car => financial from bank [loan=80 %, insurance=20 %] Association rules are satisfied if they have _ _<===>maximum loan threshold, minimum insurance threshold

63. CF tree has how many parameters?<===>2

64. Classification threshold is also called as _ _ _ _ _ _ _ _ _<===>precision threshold

65. CLASSIT and AutoClass are _ _ _ _ _ _ _ _ _ _<===>statistical approaches

66. Clustering is a form of _ _ _ _ _ _ _ _ _ _ _<===>learning by observation

67. Clustering large applications can be shortened as _ _ _ _ _<===>CLARA

68. Consider a spatial association rule Is-a(X,"office") Λ near-by(X,"house") => near-by(X,"university") [0.5 %, 80 %]. The rule states that _ _ _ _ _ _ _ _ percent of the offices that are close to houses are also close to universities.<===>80 %

69. Consider the data. Original data: 3 7 2 0 7 2. Moving average of order 3: 4 3 2 6. With weights (3,4,3), the first weighted moving average value is _ _ _ _ _ _ _ _ _ _ _<===>4.3
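
The first weighted value in item 69 comes from the first window (3, 7, 2) with weights (3, 4, 3): (3·3 + 4·7 + 3·2) / 10 = 4.3. A minimal sketch, assuming each window is normalized by the sum of its weights (the function name is illustrative):

```python
def weighted_moving_average(values, weights):
    """Weighted moving average of order len(weights), normalized by the weight sum."""
    k = len(weights)
    total = sum(weights)
    return [
        sum(w * v for w, v in zip(weights, values[i:i + k])) / total
        for i in range(len(values) - k + 1)
    ]


data = [3, 7, 2, 0, 7, 2]  # the series as printed in item 69
wma = weighted_moving_average(data, [3, 4, 3])
print(wma[0])
```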

70. Consider the following rule: Age(A,"18,19,...,29") Λ placement(A,"Infosys,IBM,...") Λ purchases(A,"mobile") => purchases(A,"high memory card") Λ purchases(A,"card reader"). This rule is notable in that it has _ _ _ _ _ _ _ _<===>repetitive predicate

71. Consider the following rule: If an engineering student in Warangal bought "speech recognition CD" and "MS Office" and "jdk 1.7", it is likely (with a probability of 58 %) that the student also bought "SQL Server" and "My SQL Server", and 6.5 % of all the students bought all five. The meta rule can be generated in association rule as _ _ _ _<===>lives(S, _, "Warangal") Λ sales(S,"speech recognition", _) Λ sales(S,"MS Office", _) Λ sales(S,"jdk 1.7", _) => sales(S,"SQL Server", _) Λ sales(S,"My SQL Server", _) [6.5 %, 58 %]

72. CPT stands for _ _ _ _ _ _ _ _ _ _ _<===>Conditional probability table

73. Data which are inconsistent with the remaining set of data is called as _ _ _ _<===>Outliers

74. DBSCAN is a _ _ _ _ _ _ _ _ _ clustering algorithm<===>density-based

75. Decision trees can easily be converted to _ _ _ _ _ _ _ rules.<===>IF-THEN

76. Density connectivity is a _ _ _ _ _ _ _ _ relation.<===>symmetric

77. DFT and DWT are two popular data-independent transformations where F stands for _ _ _ _ _ _ _ _ _ _ _ _<===>Fourier

78. In training Bayesian networks, incomplete data is referred to as _ _ _ _ _ _ _ _<===>hidden data

79. During the construction of decision tree induction the tree starts as _ _ _ _ _ _<===>single node

80. Generalizing each attribute to simple-valued data and constructing a multidimensional data cube yields a(n) _ _ _ _ _ _ _ _ _ _ _<===>object cube

81. Each object in a class is associated with _ _ _ _ _ _ _ _ _ _ _<===>object identifier & set of attributes

82. Early decision tree algorithms typically assume that the data reside in _ _ _ _<===>memory

83. EM is expanded to _ _ _ _ _ _<===>expectation maximization

84. EP stands for _ _ _ _ _ _ _ _ _ _ _<===>emerging patterns

85. Err_i = O_i (1 - O_i)(T_i - O_i), where O_i and T_i are _ _ _ _ _ _ _ _ _ & _ _ _ _ _ _ _ _<===>actual output, true output

86. For given n objects the complexity of CURE is _ _ _ _ _ _ _ _ _ _<===>O(n)

87. From the equation Y = α + βX, the coefficient α = ______________<===>ȳ - βx̄

88. The Gaussian influence function is given as _ _ _ _ _ _ _ _ _ _ _<===>e^(-d(x,y)^2 / (2σ^2))

89. Grid-based computation is _ _ _ _ _ _ _ _ _ _ _<===>query independent

90. If a k-dimensional unit is dense, then so are its projections in _ _ _ _ _ _ _ -dimensional space.<===>(k-1)

91. If a rule concerns associations between the presence or absence of items, it is a _ _<===>Boolean association

92. If a rule describes association between quantitative attributes, it is a _ _ _ _ _ _ _ _ _<===>Quantitative

93. If a set of rules has the same antecedent, then the rule with the highest confidence is selected as the _ _ _ to represent the set<===>possible rule

94. If a single distinct predicate exists in an association rule, it is a single-dimensional rule, also called a(n) _ _ _<===>intra-dimension association rule

95. If an arc is drawn from node A to node B, then A is _ _ _ _ _ _ _ _ _ of B i) parent ii) immediate predecessor iii) descendant iv) immediate successor<===>i & ii

96. Multidimensional association rules with repeated predicates, which contain multiple occurrences of some predicate, are called _ _ _ _ _ _ _ _ _<===>hybrid-dimension association rules

97. If the itemset {percent <= "59", placement = "no"} has support that increases from 0.7 % in C1 to 92.6 % in C2, the growth rate is _ _ _ _ _ _<===>92.6 %/0.7 %
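
The ratio in item 97 can be evaluated directly. In this sketch the growth rate is the support in the second data set divided by the support in the first. Treating a zero first support as infinite growth (the jumping emerging pattern case) is stated here as an assumption, and the function name is illustrative.

```python
def growth_rate(support_c1, support_c2):
    """Growth rate of an itemset from data set C1 to data set C2."""
    if support_c1 == 0:
        # Assumption: absent in C1 but present in C2 counts as infinite growth.
        return float("inf") if support_c2 > 0 else 0.0
    return support_c2 / support_c1


gr = growth_rate(0.7, 92.6)
print(round(gr, 1))
```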

98. A multidimensional association rule with no repeated predicates is also called a(n) _ _ _<===>inter-dimension association rule

99. The time interval "int=0" means _ _ _ _ _ _ _ _ _ _<===>no interval gap is allowed

100. If the transactional data is The minimum support count is _ _ _ _ _ _<===>2

101. If at most 'm' objects may lie within the d-neighborhood of an outlier, an object is decided not to be an outlier as soon as it has _ _ _ _ _ _ _ _ _ neighbors.<===>(m+1)

102. If X=>Y [a=50 %, b=2 %], then a(X=>Y) = _ _ _ _ _ _ _ _ _ _ and b(X=>Y) = _ _ _ _ _ _ _ _<===>P(X∪Y), P(Y|X)

103. In _ _ _ _ _ _ _ _ _ signature, its image includes a composition of multiple features.<===>multi feature composed

104. In _ _ _ _ _ _ _ _ _ the class label of each training sample is not known, and the number or set of classes to be learned may not be known in advance.<===>unsupervised learning

105. In _ _ _ _ _ _ _ _ the class distribution of the samples in each fold is approximately the same as that in the initial data.<===>stratified cross-validation

106. In a decision tree, _ _ _ _ _ _ _ _ represents an outcome of the test.<===>branch

107. In a multilayer feed-forward neural network, the weighted outputs of the hidden layer are inputs to the _<===>output layer

108. In the cell-based algorithm, if k represents the dimensionality and c is a constant, its complexity is defined as _ _ _ _ _ _ _ _ _ _<===>O(c^k + n)

109. In the Gaussian density function, μ and σ stand for _ _ _ _ _ _ & _ _ _ _ _ _ _ _<===>mean & standard deviation

110. In how many approaches does tree pruning work?<===>2

111. In the index-based algorithm, if k represents the dimensionality and n the number of objects in the data set, the worst-case complexity is _ _ _ _ _ _ _ _ _ _ _<===>O(k*n^2)

112. In which operation are substrings from pairs of rules swapped to form a new pair of rules?<===>crossover
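
The operation in item 112 can be illustrated on bit-string rules. The sketch below shows single-point crossover, which is one common variant chosen here as an assumption; the rule strings and cut point are hypothetical.

```python
def crossover(rule_a, rule_b, point):
    """Single-point crossover: swap the substrings after `point` between two rules."""
    child_a = rule_a[:point] + rule_b[point:]
    child_b = rule_b[:point] + rule_a[point:]
    return child_a, child_b


# Two hypothetical 3-bit rules, cut after the first bit.
child_a, child_b = crossover("100", "001", 1)
print(child_a, child_b)
```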

113. In Y = α + βX, α and β are _ _ _ _ _ _ _ _<===>regression coefficients

114. Interval-scaled variables are _ _ _ _ _ _ _ _ measurements of a linear scale.<===>continuous

115. Into how many independent sets the given data are randomly partitioned in holdout method?<===>2

116. is _ _ _ _ _ _ _ _ _ _ _<===>Jaccard coefficient

117. is _ _ _ _ _ _ _ _<===>simple matching coefficient

118. It is difficult to construct an object cube containing _ _ _ _ _ _ _ _ _ _ dimension<===>keyword

119. JEP is a special case of EP , where J stands for _ _ _ _ _ _ _ _ _ _ _ _ _ _<===>jumping

120. Let A[X] be the set of 'n' tuples t1, t2, ..., tn projected on the attribute set X. Which measure of A[X] is the average pairwise distance between the tuples projected on X?<===>diameter

121. MBR stands for _ _ _ _ _ _ _ _ _ _ _ which represents 2 points for rough estimation of a merged region.<===>minimum bounding rectangle

122. Mining _ _ _ _ _ _ _ _ _ _ specifies the periodic behavior of the time series at some, but not all of the points in time.<===>partial periodic pattern.

123. Most partitioning methods cluster objects based on the _ _ _ _ _<===>distance between objects

124. The naive Bayesian classifier is also called the _ _ _ _ _ _ _ _ _ _ classifier.<===>simple Bayesian

125. Neural network learning is also referred to as _ _ _ _ _ _ _ _ _ learning<===>connectionist

126. Normalization scales attribute values to fall within a small range such as _ _ _ _ _ _ _ _ _<===>-1.0 to +1.0

127. Object-by-object structure is also known as _ _ _ _ _ _<===>Dissimilarity matrix

128. Outlier detection and outlier analysis is a data mining task referred to as _ _ _<===>Outlier mining

129. P(H/X) is a _ _ _ _ _ _ _ _ probability.<===>posterior

130. percent(A,"70,71,...,80") => placement(A,"Infosys"); percent(A,"70,71,...,80") => placement(A,"Microsoft"); percent(A,"70,71,...,80") => placement(A,"Dell"); percent(A,"70,71,...,80") => placement(A,"IBM"). This set of rules clearly refers to a(n) _ _ _ _ _ _ _ _ _ _ _ _ _ _ rule<===>multidimensional association

131. Preprocessing of data in preparation for classification and prediction can involve _ _ _ _ _ _ for normalizing the data.<===>data transformation

132. Rough set theory is based on the establishment of _ _ _ _ _ _ classes within the given training data<===>equivalence

133. Sensitivity & specificity can be used to measure _ _ _ _ _ _ _ _ _ & _ _ _ _ _ _ _ _ _<===>+ve samples & -ve samples

134. SOM stands for _ _ _ _ _ _ _ _ _ _<===>self organizing maps

135. s_p = (1/n)(|x_1p - m_p| + |x_2p - m_p| + ... + |x_np - m_p|) calculates _ _ _ _ _ _ _ _ _ _<===>mean absolute deviation
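
The measure named in item 135 averages the absolute distances of a variable's values from their mean. A small sketch with made-up measurements (the function name and data are assumptions):

```python
def mean_absolute_deviation(values):
    """Average absolute distance of the values from their mean."""
    n = len(values)
    mean = sum(values) / n
    return sum(abs(v - mean) for v in values) / n


# Hypothetical measurements of one variable; the mean is 5.
mad = mean_absolute_deviation([2, 4, 6, 8])
print(mad)
```

With deviations 3, 1, 1 and 3 from the mean, the result is 2.0.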

136. Square wave influence function is set to 0(zero) if _ _<===>d(x,y)> σ

137. STING is a _ _ _ _ _ _ _ _ clustering algorithm<===>grid-based

138. The _ _ _ _ _ _ _ _ _ _ at an object is defined as the sum of the influence functions of all data points<===>density function
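
Item 138 can be sketched by summing a Gaussian influence function over all data points. The form exp(-d(x,y)^2 / (2σ^2)) with Euclidean d is assumed here, and the points and σ are hypothetical.

```python
import math


def gaussian_influence(x, y, sigma):
    """Influence of data point y on location x: exp(-d(x, y)^2 / (2 * sigma^2))."""
    d_squared = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-d_squared / (2 * sigma ** 2))


def density(x, points, sigma):
    """Density at x: the sum of the influence functions of all data points."""
    return sum(gaussian_influence(x, p, sigma) for p in points)


# Hypothetical 2-D data set.
points = [(0.0, 0.0), (1.0, 0.0), (5.0, 5.0)]
dens = density((0.0, 0.0), points, sigma=1.0)
print(dens)
```

The distant point (5, 5) contributes almost nothing, so the density at the origin is dominated by the two nearby points.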

139. The _ _ _ _ _ _ _ _ states that discordant values are not outliers in distribution F, but contaminants from some other distribution G.<===>mixture alternative distribution

140. In the _ _ _ _ _ _ algorithm, each cluster is represented by one of the objects located near the center of the cluster.<===>k-medoids

141. The absolute closeness between 2 clusters, normalized w.r.t the internal closeness of two clusters is _ _<===>relative closeness

142. The agglomerative approach is also called as _ _ _ _ _ approach.<===>bottom-up

143. The algorithm for "Training Bayesian Belief Networks" involves which sequence of steps? i) compute the gradients ii) renormalize the weights iii) update the weights<===>i, iii, ii

144. The basic algorithm for decision tree induction is _ _ _ _ _ _ _<===>greedy algorithm

145. The computational complexity of CLARANS is _ _ _ _ _ _ _ _ _ _<===>O(n^2)

146. The constraint "max(I.marks)<=600" is acceptable by _ _ _ _ _ _ _ _ & _ _ _ _ _ _ _ _ categories.<===>antimonotone,succinct

147. The constraint "support(S) <= ξ" is acceptable by _ _ _ _ _ _ _ _ _ _ category.<===>monotone

148. The correlation between the occurrence of A and B can be measured by computing _ _<===>corr_{A,B} = P(A∪B) / (P(A)P(B))
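
The measure in item 148 can be computed from raw transactions. The sketch assumes that P(A∪B) denotes the probability that a transaction contains both A and B, as is conventional in association rule mining. The baskets are hypothetical.

```python
def correlation(transactions, a, b):
    """corr(A, B) = P(A and B together) / (P(A) * P(B)) over a list of item sets."""
    n = len(transactions)
    p_a = sum(a in t for t in transactions) / n
    p_b = sum(b in t for t in transactions) / n
    p_ab = sum(a in t and b in t for t in transactions) / n
    return p_ab / (p_a * p_b)


# Hypothetical market baskets.
baskets = [
    {"game", "video"},
    {"game"},
    {"video"},
    {"game", "video"},
    {"bread"},
]
corr = correlation(baskets, "game", "video")
print(corr)
```

A value above 1 suggests the two items occur together more often than independence would predict. A value below 1 suggests the opposite.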

149. The data matrix is often called as _ _ _ _ _ _ _<===>two-mode matrix

150. The data tuples analyzed to build the model collectively form _ _ _<===>training data set

151. The following rule Age(A,"20,21,...,27") Λ percent(A,"60,61,...,80") Λ test(A,"B,B+,...,A+") => placement(A,"MNCs") is an example of _ _ _ _ _ _ _ _<===>multidimensional association

152. The larger the value, the greater the proportion of class members that share the attribute-value pair. This is applicable to _ _ _ _ _ _ _ _ _ _ _ _ _<===>intraclass similarity

153. The partitioning process is referred to as binning, and the intervals are considered as _ _ _<===>bins

154. The process of grouping a set of physical objects into classes of similar objects is called _ _ _<===>clustering

155. The rule "IF NOT A1 AND A2 THEN NOT C1 "is encoded as<===>010

156. The sequence of steps in conceptual clustering are _ _ _ _ _ _ _ _ _ _<===>clustering,characterization

157. The set X=>Y [a=50 %,b=2 %] is a _ _ _ _ _ _ _ _ _ _ _ item set.<===>2

158. The statement "If any student's age is above 20 and their percentage is 70 or more, they are approved for placement" can be written as a rule in rough set theory as _ _ _ _ _ _<===>if(x, age > 20) Λ if(x, percentage >= 70) then placement = approved

159. The technique of updating the weights & biases after the presentation of each sample is called _ _ _<===>case updating

160. TID stands for _ _ _ _ _ _ _ _ _ _ _<===>transaction identifier

161. Time series analysis is also referred to as _ _ _ _ _ _ _ _<===>decomposition

162. Training set & test set are two sets of _ _ _ _ _ _ _ _ _ _ _<===>holdout method

163. Transaction reduction implies _ _ _ _ _ _<===>reducing the number of transactions in the future iteration

164. WaveCluster is which kind of algorithm from the following? i) hierarchical ii) density-based iii) grid-based<===>ii & iii

165. Web linkage structures, web contents etc., are included in _ _ _ _ _ _ _ _ _<===>web mining

166. When multi-level association rules are mined, some of the rules found will be redundant due to _ _ _ _ _ _ _ _ _ _ _ relationships between them.<===>ancestor

167. In d(i,j) = (p - m)/p, m and p are _ _ _ _ _ _ _ _ _ _ _ _<===>number of matches and number of variables
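
Item 167 refers to the dissimilarity between two objects described by nominal variables. The form assumed here is d(i,j) = (p − m)/p, with m the number of matching variables and p the total number of variables. The sample objects are hypothetical.

```python
def nominal_dissimilarity(obj_i, obj_j):
    """d(i, j) = (p - m) / p for two objects with the same nominal variables."""
    p = len(obj_i)                                   # total number of variables
    m = sum(a == b for a, b in zip(obj_i, obj_j))    # number of matches
    return (p - m) / p


# Two hypothetical objects described by (color, shape, size).
d = nominal_dissimilarity(("red", "round", "small"), ("red", "square", "small"))
print(d)
```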

168. Which algorithm facilitates parallel processing?<===>STING

169. Which algorithm is included with certain series of walks through itemset space?<===>random walk through algorithm

170. Which association rule has overcome the disadvantage of Association rules?<===>Distance-based association rules

171. Which categories can be used during association mining to guide the process, leading to more efficient and effective mining?<===>antimonotone, monotone, succinct, convertible

172. Which constraints specify the set of task-relevant data?<===>data constraints

173. Which constraints are applied before mining?<===>knowledge type and data constraints

174. Which constraints may be expressed as Metarules?<===>rule constraints

175. Which method doesn't handle categorical attributes?<===>CURE

176. Which method of estimating classifier accuracy samples the given training instances uniformly with replacement?<===>bootstrapping

177. Which method overcomes the problem of favoring clusters with spherical shape and similar sizes?<===>CURE

178. Which method represents each cluster by a certain fixed number of representative objects and shrinks them towards the center of the cluster?<===>CURE

179. Which of the following are spatial operators? i)spatial-union ii)spatial-overlapping iii)spatial-intersection iv)spatial-disjoint<===>i,ii,iii

180. Which of the following criteria is not the one used for the comparison of classification and prediction?<===>data cleaning

181. Which regression is useful for modeling count data?<===>Poisson regression

182. Which step could involve huge computations?<===>prune step

183. Which threshold can be set up for passing down relatively frequent items to lower levels?<===>level passage threshold

184. While _ _ _ _ _ _ _ predicts class, _ _ _ _ _ _ _ models continuous-valued functions.<===>classification-prediction

2. _ _ _ _ _ _ _ & _ _ _ _ _ _ _ _are the techniques for approving overall classifier accuracy by learning and combining series of individual classifiers.<===>bagging & boosting

3. _ _ _ _ _ _ _ _ & _ _ _ _ _ _ _ are the two major types of prediction problems.<===>classification-regression

4. _ _ _ _ _ _ _ _ _ _ _ _ is a signal processing technique that decomposes a signal into different frequency sub bands.<===>Wavelet transform

5. _ _ _ _ _ _ _ _ _ _ _ are modified so as to minimize the mean squared error b/w the networks prediction and the actual class<===>weights of the nodes

6. _ _ _ _ _ _ _ _ _ _ _ is a two step process<===>data classification

7. _ _ _ _ _ _ _ _ _ _ _ methods use statistical measures to remove the least reliable<===>tree pruning

8. _ _ _ _ _ _ _ _ _ _ _ performs multidimensional clustering in 2 steps.<===>CLIQUE

9. _ _ _ _ _ _ _ _ _ _ _ verifies whether an object is significantly large or small in relation to the distribution F.<===>discordancy test

10. _ _ _ _ _ _ _ _ _ _ is a collection of pointers to spatial objects.<===>spatial measures

11. _ _ _ _ _ _ _ _ _ _ is used to identify outliers w.r.t the model.<===>discordancy test

12. _ _ _ _ _ _ _ _ _ _ methods quantize the object space into a finite number of cells that form a grid structure.<===>grid-based methods

13. _ _ _ _ _ _ _ _ _ _ notation be used to represent sequence of actions of the same type.<===>+

14. _ _ _ _ _ _ _ _ _ _ uses the concept to generalize the data by replacing lower-level data with high-level concepts.<===>attribute oriented induction

15. _ _ _ _ _ _ _ _ _ and _ _ _ _ _ _ _ _ _ are two operators of COBWEB.<===>merging, splitting

16. _ _ _ _ _ _ _ _ _ can be used to find the most "natural" number of clusters using a silhouette coefficient.<===>CLARANS

17. _ _ _ _ _ _ _ _ _ doesn't require a metric distance between the objects.<===>dissimilarity f unction

18. _ _ _ _ _ _ _ _ _ find all pairs of gap free windows of a small length that are similar.<===>atomic matching

19. _ _ _ _ _ _ _ _ _ is a dimension whose primitive level data are spatial starting at a certain level, becomes non-spatial.<===>spatial to non-spatial dimensions

20. _ _ _ _ _ _ _ _ _ is a task of mining significant patterns from a plan base.<===>plan mining

21. _ _ _ _ _ _ _ _ _ is used to access the percentage of samples.<===>precision

22. _ _ _ _ _ _ _ _ _ is used to improve the efficient of the Apriori algorithm.<===>Iceberg queries

23. _ _ _ _ _ _ _ _ _ refers to the preprocessing of data in order to remove noise.<===>data cleaning

24. _ _ _ _ _ _ _ _ _ stores and manages a large collection of multimedia objects.<===>multimedia database system

25. _ _ _ _ _ _ _ _ and _ _ _ _ _ _ _ _ _ are important means of generalization .<===>aggregation, approximation

26. _ _ _ _ _ _ _ _ can be modeled by adding polynomial terms to the basic linear mode.<===>polynomial regression

27. _ _ _ _ _ _ _ _ is a optimization method for spatial association analysis.<===>progressive refinement

28. _ _ _ _ _ _ _ _ is one where properties can be inherited from more than one super class.<===>multiple inheritance

29. _ _ _ _ _ _ _ _ is the database that consists of sequence of ordered events with / without concrete notions of time.<===>sequence database

30. _ _ _ _ _ _ _ _ refer to the cycles which are long term oscillations about a trend line/curve which may/may not be periodic.<===>cyclic movements

31. _ _ _ _ _ _ _ _ regression can be modeled by adding polynomial terms to the basic linear model<===>polynomial

32. _ _ _ _ _ _ _ can automatically result in the removal of outliers.<===>Wavelet transform

33. _ _ _ _ _ _ _ compete in a "winner-take-all" fashion for the object that is currently presented to the system<===>competitive learning

34. _ _ _ _ _ _ _ has the ability to automatically adjust the number of classes in a partition.<===>COBWEB

35. _ _ _ _ _ _ _ is a density-based method that computers an augmented clustering.<===>OPTICS

36. _ _ _ _ _ _ _ is a simple technique that uses a test set of class-labeled samples.<===>holdout method

37. _ _ _ _ _ are used to incorporate ideas if natural evolution<===>genetic algorithms

38. _ _ _ _ if(x, age > 20) Λ if(x, percentage >= 70) then placement = approved _ _ _ _ _ _ _ is defined in terms of Euclidean distance<===>Closeness between 2 points

39. _ _ _ _ refers to the extraction of knowledge spatial relationships not explicitly stored in spatial databases.<===>spatial data mining

40. A _ _ _ _ _ _ _ _ _ _ has complex tasks, graphics, images, videos, maps, voice, music etc.<===>multimedia database

41. A _ _ _ _ _ _ _ _ _ resembles a nominal variable.<===>discrete ordinal variable

42. A constraint "max(I.marks) >=600" is acceptable for _ _ _ _ _ _ & _ _ _ _ _ _ _categories.<===>monotone,succinct

43. A constraint such as "avg(I.marks) <= 70" is not a(n) _ _ _ _ _ _ _ _ _ _ _<===>anti-monotone

44. A huge amount of space-related data are in _ _ _ _ _ _ _ _ _ forms.<===>images

45. A Neural Network containing N hidden Layers is called as _ _ _ _ _ _ _ _ _ _ Neural network layered<===>(N+1)

46. A set of items is referred to as a(n) _ _ _ _ _ _ _ _ _ _<===>itemset

47. A set-valued attribute may be _ _ _ _ _ _ _ i) homogenous ii)heterogeneous<===>i or ii

48. Accuracy estimates also help in _ _ _ _ _ _ _ _ _<===>comparison of different classifiers

49. Accuracy is given as _ _ _ _ _ _ _ _ _ _<===>specificity * (neg/(pos+neg)) + sensitivity *(pos/(pos+neg))

50. AGNES is expanded to _ _ _ _ _ _ _ _<===>agglomerative nesting

51. Anti-monotone states _ _ _ _ _ _ _ _<===>if a set cannot pass a test, all its supersets also cannot pass the same test

52. Anti-monotone, monotone, succinct, convertible and inconvertible are five different categories of _ _ _ _ _ _ _ _ _ _ constraints.<===>rule constraints

53. Apart from prediction, the log linier model is also useful for _ _ _ _ _<===>data compression

54. Apriori algorithm employs level-wise search, where k-itemsets uses ------ itemsets.<===>(k+1)

55. Assume the fallow data X(years experience ) y( salary) 02 ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; 13000 09 ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; 45000 15 ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; 75000 03 ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; 18000 Its β value is _ _ _ _ _ _ _ _ _ _ _ _ _ _ _<===>4698.6

56. Assume the following salary details: X(years experience) Y(salary in Rs.)2 ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; 13000 9 ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; 45000 15 ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; 75000 4 ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; 18000 The value of x = _ _ _ _ _ _ _ _ _<===>8.5

57. Back propagation is a neural n/w _ _ _ _ _ _ _ _ _ _ _ algorithm<===>learning

58. Bayes theorem provides a way of calculating which probability<===>posterior

59. Block and consecutive procedures are 2 basic types for _ _ _ _ _ _ _ _ _ _ _<===>detecting outliers

60. Bootstrap is also known as _ _ _ _ _ _ _ _ _<===>bagging

61. car => financial from bank [loan=80 %, insurance=20 %] _ _ _ _ _ _ _ & _ _ _ _ _ are two measures of Association rules.<===>loan,insurance

62. car =>financial from bank [loan=80 %,insuranace=20 %] Association rules are satisfied if they have _ _<===>maximum loan threshold, minimum insurance threshold

63. CF tree has how many parameters?<===>2

64. Classification threshold is also called as _ _ _ _ _ _ _ _ _<===>precision threshold

65. CLASSIT and Auto class are _ _ _ _ _ _ _ _ _ _<===>statistical approaches

66. Clustering is a form of _ _ _ _ _ _ _ _ _ _ _<===>learning by observation

67. Clustering large applications can be shortened as _ _ _ _ _<===>CLARA

68. Consider a spatial association rule Is-a(X,"office") Λ near-by(X,"house") => near-by(X,"university") [ 0.5 % 80 %] The rule states _ _ _ _ _ _ _ _ percent of the offices are close to houses.<===>80 %

69. Consider the data Original data: 3 7 2 0 7 2 Moving average of order 3: 4 3 2 6 Weighted(3,4,3) The first weighted average value is _ _ _ _ _ _ _ _ _ _ _<===>4.3

70. Consider the following rile Age(A,"18,19, _ _ _ _ _ 29") Λ placement(A,"Infosys,IBM, _ _ _ _ ") Λ purchases(A,"mobile") => purchases(A,"high memory card") Λ purchases(A,"card reader") This rule is highlighted in saying that it has _ _ _ _ _ _ _ _<===>repetitive predicate

71. Consider the following rule: If an engineering student inWarangal bought "speech recognition CD" and "MS Office" and "jdk 1.7", it is likely (with a probability of 58 %) that the student also bought SQL Server and "My SQL Server" and 6.5 % of all the students bought all five. The meta rule can be generated in association rule as _ _ _ _<===>lives(S, _,Warangal") Λ sales(S,"speech recognition", _) Λ sales(S,"MS Office", _) Λ sales(S,"jdk 1.7", _) =>sales(S,"SQL Server", _) Λ sales(S,"My SQL Server", _) [6.5 % 58 %]

72. CPT stands for _ _ _ _ _ _ _ _ _ _ _<===>Conditional probability table

73. Data which are inconsistent with the remaining set of data is called as _ _ _ _<===>Outliers

74. DBSCAN is _ _ _ _ _ _ _ _ _ clustering algorithm<===>density-based methods

75. Decision trees can easily be converted to _ _ _ _ _ _ _ rules.<===>If-THEN

76. Density connectivity is a _ _ _ _ _ _ _ _ relation.<===>symmetric

77. DFT and DWT are two popular data-independent transformations where F stands for _ _ _ _ _ _ _ _ _ _ _ _<===>Fourier

78. During Bayesian n/w s incomplete data is referred to _ _ _ _ _ _ _ _<===>Hidden data

79. During the construction of decision tree induction the tree starts as _ _ _ _ _ _<===>single node

80. Each attribute to simple-valued data for constructing a multi-dimensional data cube are called as _ _ _ _ _ _ _ _ _ _ _<===>object cube

81. Each object in a class is associated with _ _ _ _ _ _ _ _ _ _ _<===>object identifier & set of attributes

82. Early decision tree algorithms typically assume that the data is from _ _ _ _<===>memory

83. EM is expanded to _ _ _ _ _ _<===>expectation maximization

84. EP stands for _ _ _ _ _ _ _ _ _ _ _<===>emerging patterns

85. Erri = where are _ _ _ _ _ _ _ _ _ & _ _ _ _ _ _ _ _<===>actual o/p, true o/p

86. For given n objects the complexity of CURE is _ _ _ _ _ _ _ _ _ _<===>O(n)

87. From the eg. Y= | x, the co-efficient a= ______________<===>y'-βx

88. Gaussian influence function is given as _ _ _ _ _ _ _ _ _ _ _<===>e-

89. Grid-based computation is _ _ _ _ _ _ _ _ _ _ _<===>query independent

90. If a k-dimensional unit is dense , then its projections are in _ _ _ _ _ _ _dimensional space.<===>(k-1)

91. If a rule concerns associations between the presence or absence of items, it is a _ _<===>Boolean association

92. If a rule describes association between quantitative attributes, it is a _ _ _ _ _ _ _ _ _<===>Quantitative

93. If a set of rules has the same antecedent, then the rule with the highest confidence is selected as the _ _ _ to represent the set<===>possible rule

94. If a single distinct predicate exists in a single-dimensional association rule, it is also called an _ _ _<===>intradimension association rule

95. If an arc is drawn from node A to node B then A is _ _ _ _ _ _ _ _ _ of B i) parent ii) immediate predecessor iii) descendant iv) immediate successor<===>i & ii

96. Multidimensional association rules with repeated predicates, which contain multiple occurrences of some predicate, are called _ _ _ _ _ _ _ _ _<===>hybrid-dimension association rules

97. If the item set {percent <= "59", placement = "no"} has support that increases from 0.7 % in c1 to 92.6 % in c2, the growth rate is _ _ _ _ _ _<===>92.6 % / 0.7 %
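The growth rate in question 97 is just a ratio of supports. A minimal sketch, assuming supports are given as percentages; the function name `growth_rate` and the zero-support handling are illustrative choices, not part of the original quiz:

```python
def growth_rate(support_c1: float, support_c2: float) -> float:
    """Growth rate of an itemset from class c1 to class c2 (ratio of supports)."""
    if support_c1 == 0:
        # Zero support in c1 but non-zero in c2 gives an infinite growth rate,
        # which is the "jumping" case; zero in both is treated as 0 here.
        return float("inf") if support_c2 > 0 else 0.0
    return support_c2 / support_c1


# Values taken from question 97: 0.7 % in c1, 92.6 % in c2.
rate = growth_rate(0.7, 92.6)
print(round(rate, 1))
```

A large ratio like this one is what marks the itemset as an emerging pattern from c1 to c2.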

98. A multidimensional association rule with no repeated predicates is also called an _ _ _<===>interdimension association rule

99. The time interval "int = 0" means _ _ _ _ _ _ _ _ _ _<===>no interval gap is allowed

100. If the transactional data is The minimum support count is _ _ _ _ _ _<===>2

101. If 'm' is the maximum number of objects within the d-neighborhood of an outlier, an object is decided to be not an outlier once _ _ _ _ _ _ _ _ _ neighbors are found.<===>(m+1)

102. If X => Y [a = 50 %, b = 2 %], a(X => Y) = _ _ _ _ _ _ _ _ _ _ & _ _ _ _ _ _ _ _ and<===>P(A∪B), P(B|A)

103. In _ _ _ _ _ _ _ _ _ signature, its image includes a composition of multiple features.<===>multi feature composed

104. In _ _ _ _ _ _ _ _ _ the class label of each training sample is not known, and the number or set of classes to be learned may not be known in advance.<===>unsupervised learning

105. In _ _ _ _ _ _ _ _ the class distribution of the samples in each fold is approximately the same as that in the initial data.<===>stratified cross-validation

106. In a decision tree, _ _ _ _ _ _ _ _ represents an outcome of the test.<===>branch

107. In a multilayer feed-forward NN the weighted outputs of the hidden layer are inputs to the _<===>output layer

108. In the cell-based algorithm, if k represents dimensionality, c is a constant, and n is the number of objects, its complexity is _ _ _ _ _ _ _ _ _ _<===>O(c^k + n)

109. In the Gaussian density function, μ and σ stand for _ _ _ _ _ _ & _ _ _ _ _ _ _ _<===>mean & SD

110. In how many approaches does tree pruning work?<===>2

111. In the index-based algorithm, if k represents dimensionality and n represents the number of objects in the data set, the worst-case complexity is _ _ _ _ _ _ _ _ _ _ _<===>O(k·n²)

112. In which operation are substrings from pairs of rules swapped to form a new pair of rules?<===>crossover
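Crossover can be sketched on rules encoded as bit strings. This is a minimal illustration assuming single-point crossover on equal-length strings; the function name `crossover` and the sample rules are hypothetical:

```python
from typing import Tuple


def crossover(rule_a: str, rule_b: str, point: int) -> Tuple[str, str]:
    """Single-point crossover: swap the tails of two equal-length bit strings."""
    if len(rule_a) != len(rule_b):
        raise ValueError("rules must have the same length")
    if not 0 <= point <= len(rule_a):
        raise ValueError("crossover point out of range")
    # Each child keeps one parent's head and takes the other parent's tail.
    child_a = rule_a[:point] + rule_b[point:]
    child_b = rule_b[:point] + rule_a[point:]
    return child_a, child_b


# Two hypothetical 3-bit rules, swapped after the first bit.
print(crossover("100", "001", point=1))
```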

113. In Y = α + βX, α and β are _ _ _ _ _ _ _ _<===>regression coefficients

114. Interval-scaled variables are _ _ _ _ _ _ _ _ measurements of a linear scale.<===>continuous

115. Into how many independent sets the given data are randomly partitioned in holdout method?<===>2

116. is _ _ _ _ _ _ _ _ _ _ _<===>Jaccard coefficient

117. is _ _ _ _ _ _ _ _<===>simple matching coefficient

118. It is difficult to construct an object cube containing _ _ _ _ _ _ _ _ _ _ dimension<===>keyword

119. JEP is a special case of EP, where J stands for _ _ _ _ _ _ _ _ _ _ _ _ _ _<===>jumping

120. Let A[X] be the set of 'n' tuples t1, t2, …, tn projected on the attribute set X. Which measure of A[X] is the average pairwise distance between the tuples projected on X?<===>diameter

121. MBR stands for _ _ _ _ _ _ _ _ _ _ _ which represents 2 points for rough estimation of a merged region.<===>minimum bounding rectangle

122. Mining _ _ _ _ _ _ _ _ _ _ specifies the periodic behavior of the time series at some, but not all of the points in time.<===>partial periodic pattern.

123. Most partitioning methods cluster objects based on _ _ _ _ _<===>distance between objects

124. Naive Bayesian classifier is also called as _ _ _ _ _ _ _ _ _ _ classifier.<===>simple Bayesian

125. Neural network learning is also referred to as _ _ _ _ _ _ _ _ _ learning<===>connectionist

126. Normalization scales attribute values so that they fall within a small range such as _ _ _ _ _ _ _ _ _<===>-1.0 to +1.0
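One common way to scale values into the range from question 126 is min-max normalization. The sketch below assumes that technique (the quiz does not name one); `min_max_normalize` and the sample values are illustrative:

```python
def min_max_normalize(values, new_min=-1.0, new_max=1.0):
    """Linearly rescale values so the smallest maps to new_min and the largest to new_max."""
    old_min, old_max = min(values), max(values)
    if old_max == old_min:
        # All values identical: map everything to the middle of the target range.
        return [(new_min + new_max) / 2.0 for _ in values]
    scale = (new_max - new_min) / (old_max - old_min)
    return [new_min + (v - old_min) * scale for v in values]


# Hypothetical attribute values.
print(min_max_normalize([10, 20, 30]))
```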

127. Object-by-object structure is also known as _ _ _ _ _ _<===>Dissimilarity matrix

128. Outlier detection and outlier analysis is a data mining task referred to as _ _ _<===>Outlier mining

129. P(H/X) is a _ _ _ _ _ _ _ _ probability.<===>posterior

130. percent(A, "70, 71, …, 80") => placement(A, "Infosys"); percent(A, "70, 71, …, 80") => placement(A, "Microsoft"); percent(A, "70, 71, …, 80") => placement(A, "Dell"); percent(A, "70, 71, …, 80") => placement(A, "IBM"). This set of rules clearly refers to _ _ _ _ _ _ _ _ _ _ _ _ _ _ rules<===>multidimensional association

131. Preprocessing of data in preparation for classification and prediction can involve ------ for normalizing the data.<===>data transformation

132. Rough set theory is based on the establishment of _ _ _ _ _ _ classes within the given training data<===>equivalence

133. Sensitivity & specificity can be used to measure _ _ _ _ _ _ _ _ _ & _ _ _ _ _ _ _ _ _<===>+ve samples & -ve samples
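Sensitivity and specificity can be computed directly from confusion-matrix counts. A minimal sketch; the function names and the counts in the example are hypothetical:

```python
def sensitivity(true_pos: int, false_neg: int) -> float:
    """Fraction of positive samples that are correctly recognized."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg: int, false_pos: int) -> float:
    """Fraction of negative samples that are correctly recognized."""
    return true_neg / (true_neg + false_pos)


# Hypothetical counts: 100 positive samples and 100 negative samples.
print(sensitivity(true_pos=90, false_neg=10))
print(specificity(true_neg=80, false_pos=20))
```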

134. SOM stands for _ _ _ _ _ _ _ _ _ _<===>self organizing maps

135. s_p = (1/n)(|x_1p − m_p| + |x_2p − m_p| + … + |x_np − m_p|) calculates _ _ _ _ _ _ _ _ _ _<===>mean absolute deviation

136. Square wave influence function is set to 0 (zero) if _ _<===>d(x, y) > σ

137. STING is a _ _ _ _ _ _ _ _ clustering algorithm<===>grid-based

138. The _ _ _ _ _ _ _ _ _ _ at an object is defined as the sum of influence functions of all data points<===>density function
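The density function from question 138 can be sketched as a sum of influence values. This example assumes a Gaussian influence function of the form e^(−d(x,y)² / (2σ²)) with Euclidean distance; the function names and sample points are illustrative:

```python
import math


def gaussian_influence(x, y, sigma):
    """Influence of data point y on location x, decaying with squared Euclidean distance."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-squared_distance / (2.0 * sigma ** 2))


def density(x, points, sigma):
    """Density at x: the sum of the influence of every data point on x."""
    return sum(gaussian_influence(x, p, sigma) for p in points)


# Hypothetical 2-D data set; density is highest near the two close points.
points = [(0.0, 0.0), (0.5, 0.0), (5.0, 5.0)]
print(density((0.0, 0.0), points, sigma=1.0))
```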

139. The _ _ _ _ _ _ _ _ states that discordant values are not outliers in distribution F, but contaminants from some other distribution G.<===>mixture alternative distribution

140. In the _ _ _ _ _ _ algorithm, each cluster is represented by one of the objects located near the center of the cluster.<===>k-medoids

141. The absolute closeness between 2 clusters, normalized w.r.t the internal closeness of two clusters is _ _<===>relative closeness

142. The agglomerative approach is also called as _ _ _ _ _ approach.<===>bottom-up

143. The algorithm for "Training Bayesian Belief Networks" involves which sequence of steps? i) compute the gradients ii) renormalize the weights iii) update the weights<===>i, iii, ii

144. The basic algorithm for decision tree induction is _ _ _ _ _ _ _<===>greedy algorithm

145. The computational complexity of CLARANS is _ _ _ _ _ _ _ _ _ _<===>O(n²)

146. The constraint "max(I.marks)<=600" is acceptable by _ _ _ _ _ _ _ _ & _ _ _ _ _ _ _ _ categories.<===>antimonotone,succinct

147. The constraint "support(s) is acceptable by _ _ _ _ _ _ _ _ _ _ category.<===>monotone

148. The correlation between the occurrence of A and B can be measured by computing _ _<===>corr(A,B) = P(A∪B) / (P(A)P(B))
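The correlation measure in question 148 can be estimated from transaction counts. In this sketch P(A∪B) is read, as in association-rule notation, as the probability that a transaction contains both A and B; the function name and the sample transactions are hypothetical, and both items are assumed to occur at least once:

```python
def correlation(transactions, item_a, item_b):
    """corr(A,B) = P(A and B together) / (P(A) * P(B)), estimated from transactions."""
    n = len(transactions)
    p_a = sum(item_a in t for t in transactions) / n
    p_b = sum(item_b in t for t in transactions) / n
    p_both = sum(item_a in t and item_b in t for t in transactions) / n
    return p_both / (p_a * p_b)


# Hypothetical transactions, each a set of purchased items.
transactions = [
    {"games", "videos"},
    {"games"},
    {"videos"},
    {"games", "videos"},
]
print(correlation(transactions, "games", "videos"))
```

A value below 1 suggests the two items are negatively correlated, a value above 1 suggests positive correlation, and a value of 1 suggests independence.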

149. The data matrix is often called as _ _ _ _ _ _ _<===>two-mode matrix

150. The data tuples analyzed to build the model collectively form _ _ _<===>training data set

151. The following rule age(A, "20, 21, …, 27") Λ percent(A, "60, 61, …, 80") Λ test(A, "B, B+, …, A+") => placement(A, "MNCs") is an example of _ _ _ _ _ _ _ _<===>multidimensional association

152. The larger the value, the greater the proportion of class members that share the attribute-value pair. This is applicable to _ _ _ _ _ _ _ _ _ _ _ _ _<===>intraclass similarity

153. The partitioning process is referred to as binning and the intervals are considered as _ _ _<===>bins

154. The process of grouping a set of physical objects into classes of similar objects is called _ _ _<===>clustering

155. The rule "IF NOT A1 AND A2 THEN NOT C1 "is encoded as<===>010

156. The sequence of steps in conceptual clustering are _ _ _ _ _ _ _ _ _ _<===>clustering,characterization

157. The set X=>Y [a=50 %,b=2 %] is a _ _ _ _ _ _ _ _ _ _ _ item set.<===>2

158. The statement "If any student age is above 20 and their percentage is 70 or more are approved for placement " the rule in rough set theory can be written as _ _ _ _ _ _<===>if(x, age > 20) Λ if(x, percentage >= 70) then placement =approved

159. The technique of updating the weights & biases after the presentation of each sample is called _ _ _<===>case updating

160. TID stands for _ _ _ _ _ _ _ _ _ _ _<===>transaction identifier

161. Time series analysis is also referred to as _ _ _ _ _ _ _ _<===>decomposition

162. Training set & test set are two sets of _ _ _ _ _ _ _ _ _ _ _<===>holdout method

163. Transaction reduction implies _ _ _ _ _ _<===>reducing the number of transactions in the future iteration

164. WaveCluster is a _ _ _ _ _ _ _ _ algorithm from the following. i) hierarchical ii) density-based iii) grid-based<===>ii & iii

165. Web linkage structures, web contents etc., are included in _ _ _ _ _ _ _ _ _<===>web mining

166. When multi-level association rules are mined, some of the rules found will be redundant due to _ _ _ _ _ _ _ _ _ _ _ relationships between them.<===>ancestor

167. In d(i, j) = (p − m) / p, m and p are _ _ _ _ _ _ _ _ _ _ _ _<===>number of matches and total number of variables
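Question 167 appears to refer to the usual dissimilarity for nominal (categorical) variables, d(i, j) = (p − m) / p; that reading is an assumption, since the formula is missing from the original. A minimal sketch with hypothetical objects:

```python
def nominal_dissimilarity(obj_i, obj_j):
    """d(i, j) = (p - m) / p, with m matching variables out of p in total."""
    if len(obj_i) != len(obj_j):
        raise ValueError("objects must have the same number of variables")
    p = len(obj_i)
    m = sum(a == b for a, b in zip(obj_i, obj_j))
    return (p - m) / p


# Two hypothetical objects described by three nominal variables; two of them match.
print(nominal_dissimilarity(("red", "small", "round"), ("red", "large", "round")))
```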

168. Which algorithm facilitates parallel processing?<===>STING

169. Which algorithm is included with certain series of walks through itemset space?<===>random walk through algorithm

170. Which association rule has overcome the disadvantage of Association rules?<===>Distance-based association rules

171. Which categories can be used during association mining to guide the process, leading to more efficient and effective mining?<===>antimonotone, monotone, succinct, convertible

172. Which constraint specify the set of task-relevant data?<===>data constraints

173. Which constraints are applied before mining?<===>knowledge type and data constraints

174. Which constraints may be expressed as Metarules?<===>rule constraints

175. Which method doesn't handle categorical attributes?<===>CURE

176. Which method of estimating classifier accuracy samples the given training instances uniformly with replacement?<===>bootstrapping
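Sampling uniformly with replacement can be sketched as follows. Using the instances that were never drawn as the test set is a common convention and is assumed here; `bootstrap_sample` and its arguments are illustrative names:

```python
import random


def bootstrap_sample(data, rng):
    """Draw len(data) instances uniformly with replacement; undrawn instances form the test set."""
    n = len(data)
    drawn = [rng.randrange(n) for _ in range(n)]
    drawn_set = set(drawn)
    training = [data[i] for i in drawn]  # may contain repeated instances
    testing = [data[i] for i in range(n) if i not in drawn_set]
    return training, testing


# Hypothetical data set of 10 instances; a seeded generator keeps the run repeatable.
training, testing = bootstrap_sample(list(range(10)), random.Random(0))
print(len(training), len(testing))
```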

177. Which method overcomes the problem of favoring clusters with spherical shape and similar sizes?<===>CURE

178. Which method represents each cluster by a certain fixed number of representative objects and shrinks them towards the center of the cluster<===>CURE

179. Which of the following are spatial operators? i)spatial-union ii)spatial-overlapping iii)spatial-intersection iv)spatial-disjoint<===>i,ii,iii

180. Which of the following criteria is not the one used for the comparison of classification and prediction?<===>data cleaning

181. Which regression is frequently used to model count data?<===>Poisson regression

182. Which step could involve huge computations?<===>prune step

183. Which threshold can be set up for passing down relatively frequent items to lower levels?<===>level passage threshold

184. While _ _ _ _ _ _ _ predicts class, _ _ _ _ _ _ _ models continuous-valued functions.<===>classification-prediction
