1. _ _ _ _ _ _ _ _ is a triple that summarizes information about subclusters of objects<===>clustering feature
2. _ _ _ _ _ _ _ & _ _ _ _ _ _ _ _ are techniques for improving overall classifier accuracy by learning and combining a series of individual classifiers.<===>bagging & boosting
3. _ _ _ _ _ _ _ _ & _ _ _ _ _ _ _ are the two major types of prediction problems.<===>classification-regression
4. _ _ _ _ _ _ _ _ _ _ _ _ is a signal processing technique that decomposes a signal into different frequency sub bands.<===>Wavelet transform
5. _ _ _ _ _ _ _ _ _ _ _ are modified so as to minimize the mean squared error between the network's prediction and the actual class.<===>weights of the nodes
6. _ _ _ _ _ _ _ _ _ _ _ is a two-step process.<===>data classification
7. _ _ _ _ _ _ _ _ _ _ _ methods use statistical measures to remove the least reliable branches.<===>tree pruning
8. _ _ _ _ _ _ _ _ _ _ _ performs multidimensional clustering in 2 steps.<===>CLIQUE
9. _ _ _ _ _ _ _ _ _ _ _ verifies whether an object is significantly large or small in relation to the distribution F.<===>discordancy test
10. _ _ _ _ _ _ _ _ _ _ is a collection of pointers to spatial objects.<===>spatial measures
11. _ _ _ _ _ _ _ _ _ _ is used to identify outliers w.r.t the model.<===>discordancy test
12. _ _ _ _ _ _ _ _ _ _ methods quantize the object space into a finite number of cells that form a grid structure.<===>grid-based methods
13. The _ _ _ _ _ _ _ _ _ _ notation can be used to represent a sequence of actions of the same type.<===>+
14. _ _ _ _ _ _ _ _ _ _ uses the concept to generalize the data by replacing lower-level data with high-level concepts.<===>attribute oriented induction
15. _ _ _ _ _ _ _ _ _ and _ _ _ _ _ _ _ _ _ are two operators of COBWEB.<===>merging, splitting
16. _ _ _ _ _ _ _ _ _ can be used to find the most "natural" number of clusters using a silhouette coefficient.<===>CLARANS
17. _ _ _ _ _ _ _ _ _ doesn't require a metric distance between the objects.<===>dissimilarity function
18. _ _ _ _ _ _ _ _ _ finds all pairs of gap-free windows of a small length that are similar.<===>atomic matching
19. _ _ _ _ _ _ _ _ _ is a dimension whose primitive-level data are spatial but which, starting at a certain generalization level, becomes non-spatial.<===>spatial-to-non-spatial dimension
20. _ _ _ _ _ _ _ _ _ is a task of mining significant patterns from a plan base.<===>plan mining
21. _ _ _ _ _ _ _ _ _ is used to assess the percentage of samples.<===>precision
22. _ _ _ _ _ _ _ _ _ is used to improve the efficiency of the Apriori algorithm.<===>Iceberg queries
23. _ _ _ _ _ _ _ _ _ refers to the preprocessing of data in order to remove noise.<===>data cleaning
24. _ _ _ _ _ _ _ _ _ stores and manages a large collection of multimedia objects.<===>multimedia database system
25. _ _ _ _ _ _ _ _ and _ _ _ _ _ _ _ _ _ are important means of generalization.<===>aggregation, approximation
26. _ _ _ _ _ _ _ _ can be modeled by adding polynomial terms to the basic linear model.<===>polynomial regression
27. _ _ _ _ _ _ _ _ is an optimization method for spatial association analysis.<===>progressive refinement
28. _ _ _ _ _ _ _ _ is one where properties can be inherited from more than one super class.<===>multiple inheritance
29. _ _ _ _ _ _ _ _ is the database that consists of sequence of ordered events with / without concrete notions of time.<===>sequence database
30. _ _ _ _ _ _ _ _ refer to the cycles which are long term oscillations about a trend line/curve which may/may not be periodic.<===>cyclic movements
31. _ _ _ _ _ _ _ _ regression can be modeled by adding polynomial terms to the basic linear model<===>polynomial
32. _ _ _ _ _ _ _ can automatically result in the removal of outliers.<===>Wavelet transform
33. _ _ _ _ _ _ _ compete in a "winner-take-all" fashion for the object that is currently presented to the system<===>competitive learning
34. _ _ _ _ _ _ _ has the ability to automatically adjust the number of classes in a partition.<===>COBWEB
35. _ _ _ _ _ _ _ is a density-based method that computes an augmented cluster ordering.<===>OPTICS
36. _ _ _ _ _ _ _ is a simple technique that uses a test set of class-labeled samples.<===>holdout method
37. _ _ _ _ _ attempt to incorporate ideas of natural evolution.<===>genetic algorithms
38. _ _ _ _ _ _ _ is defined in terms of Euclidean distance.<===>closeness between 2 points
39. _ _ _ _ refers to the extraction of knowledge, spatial relationships, or other patterns not explicitly stored in spatial databases.<===>spatial data mining
40. A _ _ _ _ _ _ _ _ _ _ holds complex objects such as texts, graphics, images, videos, maps, voice, music, etc.<===>multimedia database
41. A _ _ _ _ _ _ _ _ _ resembles a nominal variable.<===>discrete ordinal variable
42. A constraint "max(I.marks) >= 600" is acceptable for the _ _ _ _ _ _ & _ _ _ _ _ _ _ categories.<===>monotone, succinct
43. A constraint such as "avg(I.marks) <= 70" is not a(n) _ _ _ _ _ _ _ _ _ _ _<===>anti-monotone
44. A huge amount of space-related data are in _ _ _ _ _ _ _ _ _ forms.<===>images
45. A neural network containing N hidden layers is called a(n) _ _ _ _ _ _ _ _ _ _ -layer neural network.<===>(N+1)
46. A set of items is referred to as a(n) _ _ _ _ _ _ _ _ _ _<===>itemset
47. A set-valued attribute may be _ _ _ _ _ _ _ i) homogeneous ii) heterogeneous<===>i or ii
48. Accuracy estimates also help in _ _ _ _ _ _ _ _ _<===>comparison of different classifiers
49. Accuracy is given as _ _ _ _ _ _ _ _ _ _<===>specificity * (neg/(pos+neg)) + sensitivity *(pos/(pos+neg))
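The accuracy formula in item 49 can be checked numerically. The sketch below assumes hypothetical confusion-matrix counts (`tp`, `tn`, `fp`, `fn` are illustrative names, not from the source) and shows that the weighted sum of sensitivity and specificity reduces to the fraction of correctly classified samples.

```python
def accuracy(tp, tn, fp, fn):
    """Accuracy as a weighted sum of sensitivity and specificity."""
    pos = tp + fn          # number of positive samples
    neg = tn + fp          # number of negative samples
    sensitivity = tp / pos  # true positive rate
    specificity = tn / neg  # true negative rate
    return (sensitivity * pos / (pos + neg)
            + specificity * neg / (pos + neg))


# Hypothetical counts: 50 positive and 50 negative samples.
acc = accuracy(tp=40, tn=45, fp=5, fn=10)
print(round(acc, 4))
```

With these counts the result equals (tp + tn) / (pos + neg), which is the usual definition of accuracy.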
50. AGNES is expanded to _ _ _ _ _ _ _ _<===>agglomerative nesting
51. Anti-monotone states _ _ _ _ _ _ _ _<===>if a set cannot pass a test, all its supersets also cannot pass the same test
52. Anti-monotone, monotone, succinct, convertible and inconvertible are five different categories of _ _ _ _ _ _ _ _ _ _ constraints.<===>rule constraints
53. Apart from prediction, the log-linear model is also useful for _ _ _ _ _<===>data compression
54. The Apriori algorithm employs a level-wise search, where k-itemsets are used to explore _ _ _ _ _ _ -itemsets.<===>(k+1)
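The level-wise step in item 54 can be sketched as follows. This is a minimal illustration, assuming itemsets are stored as Python `frozenset` objects; the function name `generate_candidates` and the sample items are hypothetical. It joins frequent k-itemsets into (k+1)-candidates and prunes any candidate that has an infrequent k-subset, which is the anti-monotone property stated in item 51.

```python
from itertools import combinations


def generate_candidates(frequent_k):
    """Build (k+1)-itemset candidates from a collection of frequent k-itemsets."""
    frequent_k = set(frequent_k)
    if not frequent_k:
        return set()
    k = len(next(iter(frequent_k)))
    candidates = set()
    for a in frequent_k:
        for b in frequent_k:
            union = a | b
            if len(union) != k + 1:
                continue
            # Prune: every k-subset of a candidate must itself be frequent.
            if all(frozenset(s) in frequent_k for s in combinations(union, k)):
                candidates.add(union)
    return candidates


# Hypothetical frequent 2-itemsets.
L2 = {frozenset(p) for p in [("a", "b"), ("a", "c"), ("b", "c"), ("b", "d")]}
C3 = generate_candidates(L2)
print(C3)
```

Only {a, b, c} survives here: {a, b, d} and {b, c, d} are pruned because {a, d} and {c, d} are not frequent.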
55. Assume the following data, given as (X = years of experience, Y = salary): (2, 13000), (9, 45000), (15, 75000), (3, 18000). Its β value is _ _ _ _ _ _ _ _ _ _ _ _ _ _ _<===>4698.6
56. Assume the following salary details, given as (X = years of experience, Y = salary in Rs.): (2, 13000), (9, 45000), (15, 75000), (4, 18000). The value of x = _ _ _ _ _ _ _ _ _<===>8.5
57. Backpropagation is a neural network _ _ _ _ _ _ _ _ _ _ _ algorithm.<===>learning
58. Bayes' theorem provides a way of calculating which probability?<===>posterior
59. Block and consecutive procedures are the two basic types of procedures for _ _ _ _ _ _ _ _ _ _ _<===>detecting outliers
60. Bootstrap aggregation is also known as _ _ _ _ _ _ _ _ _<===>bagging
61. car => financial from bank [loan=80 %, insurance=20 %] _ _ _ _ _ _ _ & _ _ _ _ _ are the two measures of this association rule.<===>loan, insurance
62. car => financial from bank [loan=80 %, insurance=20 %] Association rules are satisfied if they have _ _<===>maximum loan threshold, minimum insurance threshold
63. CF tree has how many parameters?<===>2
64. Classification threshold is also called as _ _ _ _ _ _ _ _ _<===>precision threshold
65. CLASSIT and AutoClass are _ _ _ _ _ _ _ _ _ _<===>statistical approaches
66. Clustering is a form of _ _ _ _ _ _ _ _ _ _ _<===>learning by observation
67. Clustering large applications can be shortened as _ _ _ _ _<===>CLARA
68. Consider the spatial association rule Is-a(X,"office") Λ near-by(X,"house") => near-by(X,"university") [0.5 %, 80 %]. The rule states that _ _ _ _ _ _ _ _ percent of the offices that are close to houses are also close to universities.<===>80 %
69. Consider the data. Original data: 3 7 2 0 7 2. Moving average of order 3: 4 3 3 3. Weighted (3, 4, 3) moving average: the first weighted average value is _ _ _ _ _ _ _ _ _ _ _<===>4.3
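The weighted average asked for in item 69 can be reproduced with a short sketch. The function name `weighted_moving_average` is illustrative; the data and the weights (3, 4, 3) are taken from the item itself.

```python
def weighted_moving_average(data, weights):
    """Slide the weight window over the data and return the weighted means."""
    n = len(weights)
    total = sum(weights)
    return [
        sum(w * x for w, x in zip(weights, data[i:i + n])) / total
        for i in range(len(data) - n + 1)
    ]


data = [3, 7, 2, 0, 7, 2]
wma = weighted_moving_average(data, [3, 4, 3])
print(wma[0])  # (3*3 + 4*7 + 3*2) / 10 = 4.3
```

The first window covers the values 3, 7, 2, so the weighted sum is 43 and the result is 43 / 10.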
70. Consider the following rule: Age(A,"18, 19, ..., 29") Λ placement(A,"Infosys, IBM, ...") Λ purchases(A,"mobile") => purchases(A,"high memory card") Λ purchases(A,"card reader"). This rule is notable in that it has _ _ _ _ _ _ _ _<===>repetitive predicate
71. Consider the following rule: If an engineering student in Warangal bought "speech recognition CD" and "MS Office" and "jdk 1.7", it is likely (with a probability of 58 %) that the student also bought "SQL Server" and "My SQL Server", and 6.5 % of all the students bought all five. The metarule can be expressed as an association rule as _ _ _ _<===>lives(S, _, "Warangal") Λ sales(S,"speech recognition", _) Λ sales(S,"MS Office", _) Λ sales(S,"jdk 1.7", _) => sales(S,"SQL Server", _) Λ sales(S,"My SQL Server", _) [6.5 %, 58 %]
72. CPT stands for _ _ _ _ _ _ _ _ _ _ _<===>Conditional probability table
73. Data which are inconsistent with the remaining set of data is called as _ _ _ _<===>Outliers
74. DBSCAN is a _ _ _ _ _ _ _ _ _ clustering algorithm.<===>density-based
75. Decision trees can easily be converted to _ _ _ _ _ _ _ rules.<===>IF-THEN
76. Density connectivity is a _ _ _ _ _ _ _ _ relation.<===>symmetric
77. DFT and DWT are two popular data-independent transformations where F stands for _ _ _ _ _ _ _ _ _ _ _ _<===>Fourier
78. In Bayesian networks, incomplete data are referred to as _ _ _ _ _ _ _ _<===>hidden data
79. During the construction of decision tree induction the tree starts as _ _ _ _ _ _<===>single node
80. A multidimensional data cube constructed by generalizing each attribute to simple-valued data is called a(n) _ _ _ _ _ _ _ _ _ _ _<===>object cube
81. Each object in a class is associated with _ _ _ _ _ _ _ _ _ _ _<===>object identifier & set of attributes
82. Early decision tree algorithms typically assume that the data reside in _ _ _ _<===>memory
83. EM is expanded to _ _ _ _ _ _<===>expectation maximization
84. EP stands for _ _ _ _ _ _ _ _ _ _ _<===>emerging patterns
85. $Err_i = O_i(1 - O_i)(T_i - O_i)$, where $O_i$ and $T_i$ are the _ _ _ _ _ _ _ _ _ & _ _ _ _ _ _ _ _<===>actual output, true output
86. For n given objects, the complexity of CURE is _ _ _ _ _ _ _ _ _ _<===>O(n)
87. From the equation $Y = \alpha + \beta X$, the coefficient $\alpha$ = _ _ _ _ _ _ _<===>$\bar{y} - \beta\bar{x}$
88. The Gaussian influence function is given as _ _ _ _ _ _ _ _ _ _ _<===>$e^{-\frac{d(x,y)^2}{2\sigma^2}}$
89. Grid-based computation is _ _ _ _ _ _ _ _ _ _ _<===>query independent
90. If a k-dimensional unit is dense, then so are its projections in _ _ _ _ _ _ _ -dimensional space.<===>(k-1)
91. If a rule concerns associations between the presence or absence of items, it is a _ _<===>Boolean association
92. If a rule describes association between quantitative attributes, it is a _ _ _ _ _ _ _ _ _<===>Quantitative
93. If a set of rules has the same condset, then the rule with the highest confidence is selected as the _ _ _ to represent the set.<===>possible rule
94. If a single distinct predicate exists in a single-dimensional association rule, it is also called a(n) _ _ _<===>intra-dimension association rule
95. If an arc is drawn from node A to node B, then A is the _ _ _ _ _ _ _ _ _ of B. i) parent ii) immediate predecessor iii) descendant iv) immediate successor<===>i & ii
96. Multidimensional association rules with repeated predicates, which contain multiple occurrences of some predicates, are called _ _ _ _ _ _ _ _ _<===>hybrid-dimension association rules
97. If the support of the itemset {percent <= "59", placement = "no"} increases from 0.7 % in C1 to 92.6 % in C2, the growth rate is _ _ _ _ _ _<===>92.6 % / 0.7 %
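The growth rate in item 97 is the ratio of the two supports. A minimal sketch, assuming supports are passed as fractions and using the illustrative name `growth_rate`; the zero-support branch follows the usual convention that an itemset absent from the first class but present in the second has infinite growth rate (the "jumping" case of item 119).

```python
def growth_rate(support_c1, support_c2):
    """Growth rate of an itemset from class C1 to class C2."""
    if support_c1 == 0:
        # Absent in C1: infinite growth if present in C2, else 0 by convention.
        return float("inf") if support_c2 > 0 else 0.0
    return support_c2 / support_c1


gr = growth_rate(0.007, 0.926)  # 0.7 % in C1, 92.6 % in C2
print(round(gr, 2))
```

For the values in the item, the ratio is roughly 132.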
98. A multidimensional association rule with no repeated predicates is also called a(n) _ _ _<===>inter-dimension association rule
99. The time interval "int=0" means _ _ _ _ _ _ _ _ _ _<===>no interval gap is allowed
100. If the transactional data is The minimum support count is _ _ _ _ _ _<===>2
101. If there are at most m objects within the d-neighborhood of an outlier, an object is decided not to be an outlier once it has _ _ _ _ _ _ _ _ _ neighbors.<===>(m+1)
102. If X=>Y [a=50 %, b=2 %], the two measures of X=>Y are computed as _ _ _ _ _ _ _ _ _ _ & _ _ _ _ _ _ _ _<===>P(X∪Y), P(Y|X)
103. In a _ _ _ _ _ _ _ _ _ signature, the image signature includes a composition of multiple features.<===>multifeature composed
104. In _ _ _ _ _ _ _ _ _ the class label of each training sample is not known, and the number or set of classes to be learned may not be known in advance.<===>unsupervised learning
105. In _ _ _ _ _ _ _ _ the class distribution of the samples in each fold is approximately the same as that in the initial data.<===>stratified cross-validation
106. In a decision tree, _ _ _ _ _ _ _ _ represents an outcome of the test.<===>branch
107. In a multilayer feed-forward neural network, the weighted outputs of the hidden layer are inputs to the _<===>output layer
108. In the cell-based algorithm, if k represents the dimensionality and c is a constant, its complexity is defined as _ _ _ _ _ _ _ _ _ _<===>$O(c^k + n)$
109. In the Gaussian density function, $\mu$ and $\sigma$ stand for _ _ _ _ _ _ & _ _ _ _ _ _ _ _<===>mean & standard deviation
110. In how many approaches does tree pruning work?<===>2
111. In the index-based algorithm, if k represents the dimensionality and n represents the number of objects in the data set, the worst-case complexity is _ _ _ _ _ _ _ _ _ _ _<===>$O(k \cdot n^2)$
112. In which operation are substrings from pairs of rules swapped to form new pairs of rules?<===>crossover
113. In $Y = \alpha + \beta X$, $\alpha$ and $\beta$ are _ _ _ _ _ _ _ _<===>regression coefficients
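The regression coefficients of a straight-line model Y = α + βX can be estimated by least squares: β is the ratio of the summed cross-deviations to the summed squared deviations of X, and α is the mean of Y minus β times the mean of X. The sketch below assumes this standard method; the function name `fit_line` and the sample points are hypothetical and chosen so the fit is exact.

```python
def fit_line(xs, ys):
    """Least-squares estimates (alpha, beta) for Y = alpha + beta * X."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    # Slope: sum of cross-deviations over sum of squared x-deviations.
    beta = (sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
            / sum((x - x_mean) ** 2 for x in xs))
    # Intercept: the fitted line passes through the point of means.
    alpha = y_mean - beta * x_mean
    return alpha, beta


# Hypothetical points lying exactly on y = 1 + 2x.
alpha, beta = fit_line([1, 2, 3, 4], [3, 5, 7, 9])
print(alpha, beta)
```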
114. Interval-scaled variables are _ _ _ _ _ _ _ _ measurements of a linear scale.<===>continuous
115. Into how many independent sets are the given data randomly partitioned in the holdout method?<===>2
116. $d(i,j) = \frac{r+s}{q+r+s}$ is the _ _ _ _ _ _ _ _ _ _ _<===>Jaccard coefficient
117. $d(i,j) = \frac{r+s}{q+r+s+t}$ is the _ _ _ _ _ _ _ _<===>simple matching coefficient
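Both coefficients in items 116 and 117 are computed from a 2x2 contingency table of two binary objects. The sketch below assumes one common textbook convention in which both are dissimilarities: q counts variables where both objects are 1, r where only the first is 1, s where only the second is 1, and t where both are 0. The function names and sample vectors are illustrative.

```python
def contingency(a, b):
    """Return (q, r, s, t) counts for two equal-length binary vectors."""
    q = sum(1 for x, y in zip(a, b) if x == 1 and y == 1)
    r = sum(1 for x, y in zip(a, b) if x == 1 and y == 0)
    s = sum(1 for x, y in zip(a, b) if x == 0 and y == 1)
    t = sum(1 for x, y in zip(a, b) if x == 0 and y == 0)
    return q, r, s, t


def simple_matching_dissimilarity(a, b):
    """Symmetric binary variables: 0/0 matches count as agreement."""
    q, r, s, t = contingency(a, b)
    return (r + s) / (q + r + s + t)


def jaccard_dissimilarity(a, b):
    """Asymmetric binary variables: 0/0 matches are ignored."""
    q, r, s, _ = contingency(a, b)
    if q + r + s == 0:
        return 0.0  # both vectors are all zeros; treated here as identical
    return (r + s) / (q + r + s)


a = [1, 0, 1, 0, 0, 0]
b = [1, 0, 1, 0, 1, 0]
print(simple_matching_dissimilarity(a, b), jaccard_dissimilarity(a, b))
```

Here q = 2, r = 0, s = 1, t = 3, so the two values are 1/6 and 1/3. The corresponding similarities are one minus these values.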
118. It is difficult to construct an object cube containing _ _ _ _ _ _ _ _ _ _ dimension<===>keyword
119. JEP is a special case of EP, where J stands for _ _ _ _ _ _ _ _ _ _ _ _ _ _<===>jumping
120. Let A[X] be the set of n tuples t1, t2, ..., tn projected on the attribute set X. Which measure of A[X] is the average pairwise distance between the tuples projected on X?<===>diameter
121. MBR stands for _ _ _ _ _ _ _ _ _ _ _, which is represented by two points and gives a rough estimate of a merged region.<===>minimum bounding rectangle
122. Mining _ _ _ _ _ _ _ _ _ _ specifies the periodic behavior of the time series at some, but not all, of the points in time.<===>partial periodic pattern
123. Most partitioning methods cluster objects based on the _ _ _ _ _<===>distance between objects
124. The naive Bayesian classifier is also called the _ _ _ _ _ _ _ _ _ _ classifier.<===>simple Bayesian
125. Neural network learning is also referred to as _ _ _ _ _ _ _ _ _ learning.<===>connectionist
126. Normalization scales attribute values to fall within a small specified range, such as _ _ _ _ _ _ _ _ _<===>-1.0 to +1.0
127. Object-by-object structure is also known as _ _ _ _ _ _<===>Dissimilarity matrix
128. Outlier detection and outlier analysis is a data mining task referred to as _ _ _<===>Outlier mining
129. P(H/X) is a _ _ _ _ _ _ _ _ probability.<===>posterior
130. percent(A,"70, 71, ..., 80") => placement(A,"Infosys"); percent(A,"70, 71, ..., 80") => placement(A,"Microsoft"); percent(A,"70, 71, ..., 80") => placement(A,"Dell"); percent(A,"70, 71, ..., 80") => placement(A,"IBM"). This set of rules clearly refers to a _ _ _ _ _ _ _ _ _ _ _ _ _ _ rule.<===>multidimensional association
131. Preprocessing of data in preparation for classification and prediction can involve _ _ _ _ _ _ for normalizing the data.<===>data transformation
132. Rough set theory is based on the establishment of _ _ _ _ _ _ classes within the given training data.<===>equivalence
133. Sensitivity & specificity can be used to measure how well the classifier recognizes _ _ _ _ _ _ _ _ _ & _ _ _ _ _ _ _ _ _<===>positive samples & negative samples
134. SOM stands for _ _ _ _ _ _ _ _ _ _<===>self organizing maps
135. $s_p = \frac{1}{n}\left(|x_{1p} - m_p| + |x_{2p} - m_p| + \dots + |x_{np} - m_p|\right)$ calculates the _ _ _ _ _ _ _ _ _ _<===>mean absolute deviation
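The mean absolute deviation named in item 135 is the average distance of the measurements from their mean, and it is often used to standardize interval-scaled variables before clustering. A minimal sketch, with illustrative function names and hypothetical values:

```python
def mean_absolute_deviation(values):
    """Average absolute distance of the values from their mean."""
    m = sum(values) / len(values)
    return sum(abs(x - m) for x in values) / len(values)


def standardize(values):
    """z-scores computed with the mean absolute deviation as the scale."""
    m = sum(values) / len(values)
    s = mean_absolute_deviation(values)
    return [(x - m) / s for x in values]


values = [2, 4, 6, 8]  # hypothetical measurements of one variable
print(mean_absolute_deviation(values))
print(standardize(values))
```

Using the mean absolute deviation instead of the standard deviation keeps deviations unsquared, so a single extreme value inflates the scale less.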
136. The square wave influence function is set to 0 (zero) if _ _<===>d(x,y) > σ
137. STING is a _ _ _ _ _ _ _ _ clustering algorithm.<===>grid-based
138. The _ _ _ _ _ _ _ _ _ _ at an object is defined as the sum of the influence functions of all data points.<===>density function
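The density function of item 138 can be sketched by summing an influence function over all data points. The code assumes the Gaussian influence exp(-d(x,y)² / (2σ²)) and a square wave influence that is 0 when d(x,y) > σ and 1 otherwise, with d taken as Euclidean distance; the function names and sample points are illustrative.

```python
import math


def gaussian_influence(x, y, sigma):
    """Influence of point y on x, decaying smoothly with distance."""
    d = math.dist(x, y)
    return math.exp(-d ** 2 / (2 * sigma ** 2))


def square_wave_influence(x, y, sigma):
    """Influence is 1 within distance sigma and 0 beyond it."""
    return 0.0 if math.dist(x, y) > sigma else 1.0


def density(x, points, sigma, influence=gaussian_influence):
    """Density at x: the sum of the influences of all data points."""
    return sum(influence(x, p, sigma) for p in points)


points = [(0.0, 0.0), (1.0, 0.0), (5.0, 5.0)]  # hypothetical data
print(density((0.0, 0.0), points, sigma=1.0))
print(density((0.0, 0.0), points, sigma=1.0, influence=square_wave_influence))
```

With the square wave function the density is simply a count of the points within distance σ, here 2.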
139. The _ _ _ _ _ _ _ _ states that discordant values are not outliers in distribution F, but contaminants from some other distribution G.<===>mixture alternative distribution
140. In the _ _ _ _ _ _ algorithm, each cluster is represented by one of the objects located near the center of the cluster.<===>k-medoids
141. The absolute closeness between 2 clusters, normalized w.r.t the internal closeness of two clusters is _ _<===>relative closeness
142. The agglomerative approach is also called as _ _ _ _ _ approach.<===>bottom-up
143. The algorithm for training Bayesian belief networks involves which sequence of steps? i) compute the gradients ii) renormalize the weights iii) update the weights<===>i, iii, ii
144. The basic algorithm for decision tree induction is a _ _ _ _ _ _ _<===>greedy algorithm
145. The computational complexity of CLARANS is _ _ _ _ _ _ _ _ _ _<===>$O(n^2)$
146. The constraint "max(I.marks)<=600" is acceptable by _ _ _ _ _ _ _ _ & _ _ _ _ _ _ _ _ categories.<===>antimonotone,succinct
147. The constraint "support(S) ≤ ξ" is acceptable by the _ _ _ _ _ _ _ _ _ _ category.<===>monotone
148. The correlation between the occurrence of A and B can be measured by computing _ _<===>$corr_{A,B} = \frac{P(A \cup B)}{P(A)P(B)}$
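The correlation measure in item 148 can be computed directly from transaction counts. In this notation P(A ∪ B) is the probability that a transaction contains both A and B (the union of the two itemsets). The sketch assumes transactions are Python sets; the function name and the sample transactions are hypothetical.

```python
def correlation(transactions, a, b):
    """corr(A, B) = P(A and B together) / (P(A) * P(B))."""
    n = len(transactions)
    p_a = sum(1 for t in transactions if a in t) / n
    p_b = sum(1 for t in transactions if b in t) / n
    p_ab = sum(1 for t in transactions if a in t and b in t) / n
    return p_ab / (p_a * p_b)


transactions = [
    {"bread", "milk"},
    {"bread"},
    {"milk"},
    {"bread", "milk"},
]
corr = correlation(transactions, "bread", "milk")
print(round(corr, 3))
```

A value below 1 indicates negative correlation, a value above 1 positive correlation, and exactly 1 independence. Here the value is 8/9, so the two items are slightly negatively correlated.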
149. The data matrix is often called as _ _ _ _ _ _ _<===>two-mode matrix
150. The data tuples analyzed to build the model collectively form _ _ _<===>training data set
151. The following rule Age(A,"20,21 _ _ _ _ 27") Λ percent(A,"60,61 _ _ _ 80") Λ test(A,"B,B+ _ _ _ _ .A+) =>placement(A,"MNCs") is an example of _ _ _ _ _ _ _ _<===>Multi dimensional association
152. The larger the value, the greater the proportion of class members that share the attribute-value pair. This is applicable to _ _ _ _ _ _ _ _ _ _ _ _ _<===>intraclass similarity
153. The partitioning process is referred to as binning, and the intervals are considered as _ _ _<===>bins
154. The process of grouping a set of physical objects into classes of similar objects is called _ _ _<===>clustering
155. The rule "IF NOT A1 AND A2 THEN NOT C1 "is encoded as<===>010
156. The sequence of steps in conceptual clustering is _ _ _ _ _ _ _ _ _ _<===>clustering, characterization
157. The set X=>Y [a=50 %, b=2 %] is a _ _ _ _ _ _ _ _ _ _ _ -itemset.<===>2
158. The statement "If a student's age is above 20 and their percentage is 70 or more, they are approved for placement" can be written as a rule in rough set theory as _ _ _ _ _ _<===>if(x, age > 20) Λ if(x, percentage >= 70) then placement = approved
159. The technique of updating the weights & biases after the presentation of each sample is called _ _ _<===>case updating
160. TID stands for _ _ _ _ _ _ _ _ _ _ _<===>transaction identifier
161. Time series analysis is also referred to as _ _ _ _ _ _ _ _<===>decomposition
162. The training set & test set are the two sets used in the _ _ _ _ _ _ _ _ _ _ _<===>holdout method
163. Transaction reduction implies _ _ _ _ _ _<===>reducing the number of transactions in the future iteration
164. WaveCluster is which of the following types of algorithm? i) hierarchical ii) density-based iii) grid-based<===>ii & iii
165. Web linkage structures, web contents etc., are included in _ _ _ _ _ _ _ _ _<===>web mining
166. When multi-level association rules are mined, some of the rules found will be redundant due to _ _ _ _ _ _ _ _ _ _ _ relationships between them.<===>ancestor
167. where m and p are _ _ _ _ _ _ _ _ _ _ _ _<===>number of matches and number of variables
168. Which algorithm facilitates parallel processing?<===>STING
169. Which algorithm consists of a series of walks through itemset space?<===>random walk through algorithm
170. Which association rule has overcome the disadvantage of Association rules?<===>Distance-based association rules
171. Which categories can be used during association mining to guide the process, leading to more efficient and effective mining?<===>antimonotone, monotone, succinct, convertible
172. Which constraints specify the set of task-relevant data?<===>data constraints
173. Which constraints are applied before mining?<===>knowledge type and data constraints
174. Which constraints may be expressed as Metarules?<===>rule constraints
175. Which method doesn't handle categorical attributes?<===>CURE
176. Which method of estimating classifier accuracy samples the given training instances uniformly with replacement?<===>bootstrapping
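Sampling uniformly with replacement, as described in item 176, can be sketched as follows. This is a minimal illustration, assuming the samples never drawn are held out as the test set; the function name `bootstrap_split` and the `seed` parameter are hypothetical.

```python
import random


def bootstrap_split(samples, seed=None):
    """Draw len(samples) items with replacement; undrawn items form the test set."""
    rng = random.Random(seed)
    n = len(samples)
    drawn = [rng.randrange(n) for _ in range(n)]  # indices, repeats allowed
    chosen = set(drawn)
    train = [samples[i] for i in drawn]
    test = [samples[i] for i in range(n) if i not in chosen]
    return train, test


samples = list(range(10))  # hypothetical training instances
train, test = bootstrap_split(samples, seed=0)
print(len(train), sorted(set(train)), test)
```

Because the draw is with replacement, some instances usually appear more than once in the training set while others are left out entirely.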
177. Which method overcomes the problem of favoring clusters with spherical shapes and similar sizes?<===>CURE
178. Which method represents each cluster by a certain fixed number of representative objects and shrinks them towards the center of the cluster?<===>CURE
179. Which of the following are spatial operators? i)spatial-union ii)spatial-overlapping iii)spatial-intersection iv)spatial-disjoint<===>i,ii,iii
180. Which of the following criteria is not the one used for the comparison of classification and prediction?<===>data cleaning
181. Which regression is commonly used to model count data?<===>Poisson regression
182. Which step could involve huge computations?<===>prune step
183. Which threshold can be set up for passing down relatively frequent items to lower levels?<===>level passage threshold
184. While _ _ _ _ _ _ _ predicts class, _ _ _ _ _ _ _ models continuous-valued functions.<===>classification-prediction
2. _ _ _ _ _ _ _ & _ _ _ _ _ _ _ _are the techniques for approving overall classifier accuracy by learning and combining series of individual classifiers.<===>bagging & boosting
3. _ _ _ _ _ _ _ _ & _ _ _ _ _ _ _ are the two major types of prediction problems.<===>classification-regression
4. _ _ _ _ _ _ _ _ _ _ _ _ is a signal processing technique that decomposes a signal into different frequency sub bands.<===>Wavelet transform
5. _ _ _ _ _ _ _ _ _ _ _ are modified so as to minimize the mean squared error b/w the networks prediction and the actual class<===>weights of the nodes
6. _ _ _ _ _ _ _ _ _ _ _ is a two step process<===>data classification
7. _ _ _ _ _ _ _ _ _ _ _ methods use statistical measures to remove the least reliable<===>tree pruning
8. _ _ _ _ _ _ _ _ _ _ _ performs multidimensional clustering in 2 steps.<===>CLIQUE
9. _ _ _ _ _ _ _ _ _ _ _ verifies whether an object is significantly large or small in relation to the distribution F.<===>discordancy test
10. _ _ _ _ _ _ _ _ _ _ is a collection of pointers to spatial objects.<===>spatial measures
11. _ _ _ _ _ _ _ _ _ _ is used to identify outliers w.r.t the model.<===>discordancy test
12. _ _ _ _ _ _ _ _ _ _ methods quantize the object space into a finite number of cells that form a grid structure.<===>grid-based methods
13. _ _ _ _ _ _ _ _ _ _ notation be used to represent sequence of actions of the same type.<===>+
14. _ _ _ _ _ _ _ _ _ _ uses the concept to generalize the data by replacing lower-level data with high-level concepts.<===>attribute oriented induction
15. _ _ _ _ _ _ _ _ _ and _ _ _ _ _ _ _ _ _ are two operators of COBWEB.<===>merging, splitting
16. _ _ _ _ _ _ _ _ _ can be used to find the most "natural" number of clusters using a silhouette coefficient.<===>CLARANS
17. _ _ _ _ _ _ _ _ _ doesn't require a metric distance between the objects.<===>dissimilarity f unction
18. _ _ _ _ _ _ _ _ _ find all pairs of gap free windows of a small length that are similar.<===>atomic matching
19. _ _ _ _ _ _ _ _ _ is a dimension whose primitive level data are spatial starting at a certain level, becomes non-spatial.<===>spatial to non-spatial dimensions
20. _ _ _ _ _ _ _ _ _ is a task of mining significant patterns from a plan base.<===>plan mining
21. _ _ _ _ _ _ _ _ _ is used to access the percentage of samples.<===>precision
22. _ _ _ _ _ _ _ _ _ is used to improve the efficient of the Apriori algorithm.<===>Iceberg queries
23. _ _ _ _ _ _ _ _ _ refers to the preprocessing of data in order to remove noise.<===>data cleaning
24. _ _ _ _ _ _ _ _ _ stores and manages a large collection of multimedia objects.<===>multimedia database system
25. _ _ _ _ _ _ _ _ and _ _ _ _ _ _ _ _ _ are important means of generalization .<===>aggregation, approximation
26. _ _ _ _ _ _ _ _ can be modeled by adding polynomial terms to the basic linear mode.<===>polynomial regression
27. _ _ _ _ _ _ _ _ is a optimization method for spatial association analysis.<===>progressive refinement
28. _ _ _ _ _ _ _ _ is one where properties can be inherited from more than one super class.<===>multiple inheritance
29. _ _ _ _ _ _ _ _ is the database that consists of sequence of ordered events with / without concrete notions of time.<===>sequence database
30. _ _ _ _ _ _ _ _ refer to the cycles which are long term oscillations about a trend line/curve which may/may not be periodic.<===>cyclic movements
31. _ _ _ _ _ _ _ _ regression can be modeled by adding polynomial terms to the basic linear model<===>polynomial
32. _ _ _ _ _ _ _ can automatically result in the removal of outliers.<===>Wavelet transform
33. _ _ _ _ _ _ _ compete in a "winner-take-all" fashion for the object that is currently presented to the system<===>competitive learning
34. _ _ _ _ _ _ _ has the ability to automatically adjust the number of classes in a partition.<===>COBWEB
35. _ _ _ _ _ _ _ is a density-based method that computers an augmented clustering.<===>OPTICS
36. _ _ _ _ _ _ _ is a simple technique that uses a test set of class-labeled samples.<===>holdout method
37. _ _ _ _ _ are used to incorporate ideas if natural evolution<===>genetic algorithms
38. _ _ _ _ if(x, age > 20) Λ if(x, percentage >= 70) then placement = approved _ _ _ _ _ _ _ is defined in terms of Euclidean distance<===>Closeness between 2 points
39. _ _ _ _ refers to the extraction of knowledge spatial relationships not explicitly stored in spatial databases.<===>spatial data mining
40. A _ _ _ _ _ _ _ _ _ _ has complex tasks, graphics, images, videos, maps, voice, music etc.<===>multimedia database
41. A _ _ _ _ _ _ _ _ _ resembles a nominal variable.<===>discrete ordinal variable
42. A constraint "max(I.marks) >=600" is acceptable for _ _ _ _ _ _ & _ _ _ _ _ _ _categories.<===>monotone,succinct
43. A constraint such as "avg(I.marks) <= 70" is not a(n) _ _ _ _ _ _ _ _ _ _ _<===>anti-monotone
44. A huge amount of space-related data are in _ _ _ _ _ _ _ _ _ forms.<===>images
45. A Neural Network containing N hidden Layers is called as _ _ _ _ _ _ _ _ _ _ Neural network layered<===>(N+1)
46. A set of items is referred to as a(n) _ _ _ _ _ _ _ _ _ _<===>itemset
47. A set-valued attribute may be _ _ _ _ _ _ _ i) homogenous ii)heterogeneous<===>i or ii
48. Accuracy estimates also help in _ _ _ _ _ _ _ _ _<===>comparison of different classifiers
49. Accuracy is given as _ _ _ _ _ _ _ _ _ _<===>specificity * (neg/(pos+neg)) + sensitivity *(pos/(pos+neg))
50. AGNES is expanded to _ _ _ _ _ _ _ _<===>agglomerative nesting
51. Anti-monotone states _ _ _ _ _ _ _ _<===>if a set cannot pass a test, all its supersets also cannot pass the same test
52. Anti-monotone, monotone, succinct, convertible and inconvertible are five different categories of _ _ _ _ _ _ _ _ _ _ constraints.<===>rule constraints
53. Apart from prediction, the log linier model is also useful for _ _ _ _ _<===>data compression
54. Apriori algorithm employs level-wise search, where k-itemsets uses ------ itemsets.<===>(k+1)
55. Assume the fallow data X(years experience ) y( salary) 02 ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; 13000 09 ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; 45000 15 ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; 75000 03 ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; 18000 Its β value is _ _ _ _ _ _ _ _ _ _ _ _ _ _ _<===>4698.6
56. Assume the following salary details: X(years experience) Y(salary in Rs.)2 ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; 13000 9 ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; 45000 15 ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; 75000 4 ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; ; 18000 The value of x = _ _ _ _ _ _ _ _ _<===>8.5
57. Back propagation is a neural n/w _ _ _ _ _ _ _ _ _ _ _ algorithm<===>learning
58. Bayes theorem provides a way of calculating which probability<===>posterior
59. Block and consecutive procedures are 2 basic types for _ _ _ _ _ _ _ _ _ _ _<===>detecting outliers
60. Bootstrap is also known as _ _ _ _ _ _ _ _ _<===>bagging
61. car => financial from bank [loan=80 %, insurance=20 %] _ _ _ _ _ _ _ & _ _ _ _ _ are two measures of Association rules.<===>loan,insurance
62. car =>financial from bank [loan=80 %,insuranace=20 %] Association rules are satisfied if they have _ _<===>maximum loan threshold, minimum insurance threshold
63. CF tree has how many parameters?<===>2
64. Classification threshold is also called as _ _ _ _ _ _ _ _ _<===>precision threshold
65. CLASSIT and Auto class are _ _ _ _ _ _ _ _ _ _<===>statistical approaches
66. Clustering is a form of _ _ _ _ _ _ _ _ _ _ _<===>learning by observation
67. Clustering large applications can be shortened as _ _ _ _ _<===>CLARA
68. Consider a spatial association rule Is-a(X,"office") Λ near-by(X,"house") => near-by(X,"university") [ 0.5 % 80 %] The rule states _ _ _ _ _ _ _ _ percent of the offices are close to houses.<===>80 %
69. Consider the data Original data: 3 7 2 0 7 2 Moving average of order 3: 4 3 2 6 Weighted(3,4,3) The first weighted average value is _ _ _ _ _ _ _ _ _ _ _<===>4.3
70. Consider the following rile Age(A,"18,19, _ _ _ _ _ 29") Λ placement(A,"Infosys,IBM, _ _ _ _ ") Λ purchases(A,"mobile") => purchases(A,"high memory card") Λ purchases(A,"card reader") This rule is highlighted in saying that it has _ _ _ _ _ _ _ _<===>repetitive predicate
71. Consider the following rule: If an engineering student inWarangal bought "speech recognition CD" and "MS Office" and "jdk 1.7", it is likely (with a probability of 58 %) that the student also bought SQL Server and "My SQL Server" and 6.5 % of all the students bought all five. The meta rule can be generated in association rule as _ _ _ _<===>lives(S, _,Warangal") Λ sales(S,"speech recognition", _) Λ sales(S,"MS Office", _) Λ sales(S,"jdk 1.7", _) =>sales(S,"SQL Server", _) Λ sales(S,"My SQL Server", _) [6.5 % 58 %]
72. CPT stands for _ _ _ _ _ _ _ _ _ _ _<===>Conditional probability table
73. Data which are inconsistent with the remaining set of data is called as _ _ _ _<===>Outliers
74. DBSCAN is _ _ _ _ _ _ _ _ _ clustering algorithm<===>density-based methods
75. Decision trees can easily be converted to _ _ _ _ _ _ _ rules.<===>If-THEN
76. Density connectivity is a _ _ _ _ _ _ _ _ relation.<===>symmetric
77. DFT and DWT are two popular data-independent transformations where F stands for _ _ _ _ _ _ _ _ _ _ _ _<===>Fourier
78. During Bayesian n/w s incomplete data is referred to _ _ _ _ _ _ _ _<===>Hidden data
79. During the construction of decision tree induction the tree starts as _ _ _ _ _ _<===>single node
80. Each attribute to simple-valued data for constructing a multi-dimensional data cube are called as _ _ _ _ _ _ _ _ _ _ _<===>object cube
81. Each object in a class is associated with _ _ _ _ _ _ _ _ _ _ _<===>object identifier & set of attributes
82. Early decision tree algorithms typically assume that the data is from _ _ _ _<===>memory
83. EM is expanded to _ _ _ _ _ _<===>expectation maximization
84. EP stands for _ _ _ _ _ _ _ _ _ _ _<===>emerging patterns
85. Erri = where are _ _ _ _ _ _ _ _ _ & _ _ _ _ _ _ _ _<===>actual o/p, true o/p
86. For given n objects the complexity of CURE is _ _ _ _ _ _ _ _ _ _<===>O(n)
87. From the eg. Y= | x, the co-efficient a= ______________<===>y'-βx
88. Gaussian influence function is given as _ _ _ _ _ _ _ _ _ _ _<===>e-
89. Grid-based computation is _ _ _ _ _ _ _ _ _ _ _<===>query independent
90. If a k-dimensional unit is dense , then its projections are in _ _ _ _ _ _ _dimensional space.<===>(k-1)
91. If a rule concerns associations between the presence or absence of items, it is a _ _<===>Boolean association
92. If a rule describes association between quantitative attributes, it is a _ _ _ _ _ _ _ _ _<===>Quantitative
93. If a set of rules has the same antecedent, then the rule with the highest confidence is selected as the _ _ _ to represent the set<===>possible rule
94. A single dimensional association rule, which contains a single distinct predicate, is also called an _ _ _<===>intra dimension association rule
95. If an arc is drawn from node A to node B then A is _ _ _ _ _ _ _ _ _ of B i) parent ii) immediate predecessor iii) descendant iv) immediate successor<===>i & ii
96. A multi dimensional association rule with repeated predicates, i.e. one that contains multiple occurrences of some predicate, is called a _ _ _ _ _ _ _ _ _<===>hybrid association rule
97. If the item set {percent <= "59", placement = "no"} has support that increases from 0.7 % in c1 to 92.6 % in c2, the growth rate is _ _ _ _ _ _<===>92.6 %/0.7 %
98. A multi dimensional association rule with no repeated predicates is also called an _ _ _ _ _ _<===>inter dimension association rule
99. The time interval "int=0" means _ _ _ _ _ _ _ _ _ _<===>no interval gap is allowed
100. If the transactional data is The minimum support count is _ _ _ _ _ _<===>2
101. If at most 'm' objects may lie within the d-neighborhood of an outlier, an object is decided to be not an outlier once _ _ _ _ _ _ _ _ _ neighbors are found.<===>(m+1)
102. If X=>Y[a=50 %,b=2 %] a(X=>Y) = _ _ _ _ _ _ _ _ _ _ & _ _ _ _ _ _ _ _ and<===>P(AUB),P(B/A)
103. In _ _ _ _ _ _ _ _ _ signature, its image includes a composition of multiple features.<===>multi feature composed
104. In _ _ _ _ _ _ _ _ _ the class label of each training sample is not known, and the number or set of classes to be learned may not be known in advance.<===>unsupervised learning
105. In _ _ _ _ _ _ _ _ the class distribution of the samples in each fold is approximately the same as that in the initial data.<===>stratified cross-validation
106. In a decision tree, _ _ _ _ _ _ _ _ represents an outcome of the test.<===>branch
107. In a multilayer feed-forward NN the weighted outputs of the hidden layer are inputs to the _<===>output layer
108. In the cell-based algorithm, if k represents dimensionality and c is a constant, its complexity is _ _ _ _ _ _ _ _ _ _<===>O(c^k + n)
109. In the Gaussian density function, μ and σ stand for _ _ _ _ _ _ & _ _ _ _ _ _ _ _<===>mean & SD
110. In how many approaches does tree pruning work?<===>2
111. In the index-based algorithm, if k represents dimensionality and n represents the number of objects in the data set, the worst-case complexity is _ _ _ _ _ _ _ _ _ _ _<===>O(k*n^2)
112. In which operation are substrings from pairs of rules swapped to form a new pair of rules?<===>crossover
113. In Y = α + βX, α and β are _ _ _ _ _ _ _ _<===>regression coefficients
114. Interval-scaled variables are _ _ _ _ _ _ _ _ measurements of a linear scale.<===>continuous
115. Into how many independent sets the given data are randomly partitioned in holdout method?<===>2
116. is _ _ _ _ _ _ _ _ _ _ _<===>Jaccard coefficient
117. is _ _ _ _ _ _ _ _<===>simple matching coefficient
118. It is difficult to construct an object cube containing _ _ _ _ _ _ _ _ _ _ dimension<===>keyword
119. JEP is a special case of EP , where J stands for _ _ _ _ _ _ _ _ _ _ _ _ _ _<===>jumping
120. Let A[X] be the set of 'n' tuples t1, t2, ..., tn projected on the attribute set X. Which measure of A[X] is the average pairwise distance between the tuples projected on X?<===>diameter
121. MBR stands for _ _ _ _ _ _ _ _ _ _ _ which represents 2 points for rough estimation of a merged region.<===>minimum bounding rectangle
122. Mining _ _ _ _ _ _ _ _ _ _ specifies the periodic behavior of the time series at some, but not all of the points in time.<===>partial periodic pattern.
123. Most partitioning methods cluster objects based on _ _ _ _ _<===>distance between objects
124. Naive Bayesian classifier is also called as _ _ _ _ _ _ _ _ _ _ classifier.<===>simple Bayesian
125. Neural network learning is also referred to as _ _ _ _ _ _ _ _ _ learning<===>connectionist
126. Normalization scales attribute values to fall within a small range such as _ _ _ _ _ _ _ _ _<===>-1.0 to +1.0
127. Object-by-object structure is also known as _ _ _ _ _ _<===>Dissimilarity matrix
128. Outlier detection and outlier analysis is a data mining task referred to as _ _ _<===>Outlier mining
129. P(H/X) is a _ _ _ _ _ _ _ _ probability.<===>posterior
130. percent(A,"70,71...80") => placement(A,"Infosys"); percent(A,"70,71...80") => placement(A,"Microsoft"); percent(A,"70,71...80") => placement(A,"Dell"); percent(A,"70,71...80") => placement(A,"IBM"). This set of rules clearly refers to a _ _ _ _ _ _ _ _ _ _ _ _ _ _ rule<===>Multi dimensional association
131. Preprocessing of data in preparation for classification and prediction can involve ------ for normalizing the data.<===>data transformation
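132. Rough set theory is based on the establishment of _ _ _ _ _ _ _ _ classes<===>equivalence
133. Sensitivity & specificity can be used to measure how well a classifier recognizes _ _ _ _ _ _ _ _ _ & _ _ _ _ _ _ _ _ _<===>positive samples & negative samples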
134. SOM stands for _ _ _ _ _ _ _ _ _ _<===>self organizing maps
135. Sp = calculates _ _ _ _ _ _ _ _ _ _<===>mean absolute deviation
136. Square wave influence function is set to 0(zero) if _ _<===>d(x,y)> σ
137. STING is _ _ _ _ _ _ _ _ clustering algorithm<===>grid-based methods
138. The _ _ _ _ _ _ _ _ _ _ at an object is defined as the sum of influence functions of all data points<===>density function
139. The _ _ _ _ _ _ _ _ states that discordant values are not outliers in distribution F, but contaminants from some other distribution G.<===>mixture alternative distribution
140. In the _ _ _ _ _ _ algorithm, each cluster is represented by one of the objects located near the center of the cluster.<===>k-medoids
141. The absolute closeness between 2 clusters, normalized w.r.t the internal closeness of two clusters is _ _<===>relative closeness
142. The agglomerative approach is also called as _ _ _ _ _ approach.<===>bottom-up
143. The algorithm for "Training Bayesian Belief Networks" involves which sequence of steps? i) compute the gradients ii) renormalize the weights iii) update the weights<===>i, iii, ii
144. The basic algorithm for decision tree induction is _ _ _ _ _ _ _<===>greedy algorithm
145. The computational complexity of CLARANS is _ _ _ _ _ _ _ _ _ _<===>O(n^2)
146. The constraint "max(I.marks)<=600" is acceptable by _ _ _ _ _ _ _ _ & _ _ _ _ _ _ _ _ categories.<===>antimonotone,succinct
147. The constraint "support(s) is acceptable by _ _ _ _ _ _ _ _ _ _ category.<===>monotone
148. The correlation between the occurrence of A and B can be measured by computing _ _<===>corrA,B = P(A∪B) / (P(A)P(B))
149. The data matrix is often called as _ _ _ _ _ _ _<===>two-mode matrix
150. The data tuples analyzed to build the model collectively form _ _ _<===>training data set
151. The following rule Age(A,"20,21...27") Λ percent(A,"60,61...80") Λ test(A,"B,B+...A+") => placement(A,"MNCs") is an example of _ _ _ _ _ _ _ _<===>Multi dimensional association
152. The larger the value, the greater the proportion of class members that share the attribute-value pair. This is applicable to _ _ _ _ _ _ _ _ _ _ _ _ _<===>intraclass similarity
153. The partitioning process is referred to as binning and the intervals are considered as _ _ _<===>bins
154. The process of grouping a set of physical objects into classes of similar objects is called as _ _ _<===>clustering
155. The rule "IF NOT A1 AND A2 THEN NOT C1" is encoded as<===>010
156. The sequence of steps in conceptual clustering is _ _ _ _ _ _ _ _ _ _<===>clustering, characterization
157. The set X=>Y [a=50 %,b=2 %] is a _ _ _ _ _ _ _ _ _ _ _ item set.<===>2
158. The statement "Any student whose age is above 20 and whose percentage is 70 or more is approved for placement" can be written as a rule in rough set theory as _ _ _ _ _ _<===>if(x, age > 20) Λ if(x, percentage >= 70) then placement = approved
159. The technique of updating the weights & biases after the presentation of each sample is<===>case updating
160. TID stands for _ _ _ _ _ _ _ _ _ _ _<===>transaction identifier
161. Time series analysis is also referred to as _ _ _ _ _ _ _ _<===>decomposition
162. Training set & test set are two sets of _ _ _ _ _ _ _ _ _ _ _<===>holdout method
163. Transaction reduction implies _ _ _ _ _ _<===>reducing the number of transactions scanned in future iterations
164. WaveCluster is a _ _ _ _ _ _ _ _ algorithm from the following. i) hierarchical ii) density-based iii) grid-based<===>ii & iii
165. Web linkage structures, web contents etc., are included in _ _ _ _ _ _ _ _ _<===>web mining
166. When multi-level association rules are mined, some of the rules found will be redundant due to _ _ _ _ _ _ _ _ _ _ _ relationships between them.<===>ancestor
167. In d(i,j) = (p - m)/p, m and p are _ _ _ _ _ _ _ _ _ _ _ _<===>number of matches and number of variables
168. Which algorithm facilitates parallel processing?<===>STING
169. Which algorithm consists of a series of walks through itemset space?<===>random walk through algorithm
170. Which association rule has overcome the disadvantage of Association rules?<===>Distance-based association rules
171. Which categories can be used during association mining to guide the process, leading to more efficient and effective mining?<===>antimonotone, monotone, succinct, convertible
172. Which constraint specifies the set of task-relevant data?<===>data constraints
173. Which constraints are applied before mining?<===>knowledge type and data constraints
174. Which constraints may be expressed as Metarules?<===>rule constraints
175. Which method doesn't handle categorical attributes?<===>CURE
176. Which method of estimating classifier accuracy samples the given training instances uniformly with replacement?<===>bootstrapping
177. Which method overcomes the problem of favoring clusters with spherical shape and similar sizes?<===>CURE
178. Which method represents each cluster by a certain fixed number of representative objects and shrinks them towards the center of the cluster?<===>CURE
179. Which of the following are spatial operators? i)spatial-union ii)spatial-overlapping iii)spatial-intersection iv)spatial-disjoint<===>i,ii,iii
180. Which of the following criteria is not the one used for the comparison of classification and prediction?<===>data cleaning
181. Which regression is used to model count data?<===>Poisson regression
182. Which step could involve huge computations?<===>prune step
183. Which threshold can be set up for passing down relatively frequent items to lower levels?<===>level passage threshold
184. While _ _ _ _ _ _ _ predicts class, _ _ _ _ _ _ _ models continuous-valued functions.<===>classification-prediction
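Worked check for items 97 and 148: both answers are simple ratios, so they can be verified with a few lines of Python. This is only a minimal sketch. The function names `growth_rate` and `lift` are illustrative, the zero-support handling follows the usual emerging-pattern convention (an assumption here, not stated in the quiz), and the probabilities passed to `lift` are made-up values.

```python
def growth_rate(support_c1: float, support_c2: float) -> float:
    """Growth rate of an item set from class c1 to class c2 (item 97).

    Assumed convention: 0 if both supports are 0, infinity if only the
    c1 support is 0 (the "jumping" case of item 119).
    """
    if support_c1 == 0:
        return float("inf") if support_c2 > 0 else 0.0
    return support_c2 / support_c1


def lift(p_a_and_b: float, p_a: float, p_b: float) -> float:
    """corrA,B = P(A∪B) / (P(A)P(B)) from item 148.

    P(A∪B) is read here as the probability that a transaction
    contains both A and B.
    """
    return p_a_and_b / (p_a * p_b)


# Item 97: support rises from 0.7 % in c1 to 92.6 % in c2.
print(growth_rate(0.7, 92.6))   # roughly 132.3

# Hypothetical probabilities, chosen only for illustration.
print(lift(0.4, 0.6, 0.75))     # below 1, so A and B are negatively correlated
```

A value of `lift` above 1 indicates positive correlation, a value of 1 indicates independence, and a value below 1 indicates negative correlation.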