Efficient And Exact Computation Of Inclusion Dependencies For Data Integration
Book Synopsis Efficient and Exact Computation of Inclusion Dependencies for Data Integration by : Jana Bauckmann
Download or read book Efficient and Exact Computation of Inclusion Dependencies for Data Integration written by Jana Bauckmann and published by Universitätsverlag Potsdam. This book was released on 2010 with total page 46 pages. Available in PDF, EPUB and Kindle. Book excerpt: Data obtained from foreign data sources often come with only superficial structural information, such as relation names and attribute names. Other types of metadata that are important for effective integration and meaningful querying of such data sets are missing. In particular, relationships among attributes, such as foreign keys, are crucial metadata for understanding the structure of an unknown database. The discovery of such relationships is difficult, because in principle for each pair of attributes in the database each pair of data values must be compared. A precondition for a foreign key is an inclusion dependency (IND) between the key and the foreign key attributes. We present with Spider an algorithm that efficiently finds all INDs in a given relational database. It leverages the sorting facilities of DBMS but performs the actual comparisons outside of the database to save computation. Spider analyzes very large databases up to an order of magnitude faster than previous approaches. We also evaluate in detail the effectiveness of several heuristics to reduce the number of necessary comparisons. Furthermore, we generalize Spider to find composite INDs covering multiple attributes, and partial INDs, which are true INDs for all but a certain number of values. This last type is particularly relevant when integrating dirty data as is often the case in the life sciences domain - our driving motivation.
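The core test behind an inclusion dependency can be illustrated with a small sketch. This is not the Spider algorithm itself, only a simplified illustration of the idea of sorting each attribute's values once and then comparing them in a single merge pass outside the database. The function names, the in-memory `columns` dictionary, and the sample attribute names are assumptions made for the example; NULL handling and the book's pruning heuristics are ignored.

```python
def is_included(dependent, referenced):
    """Return True if every distinct value in `dependent` occurs in `referenced`."""
    dep = sorted(set(dependent))
    ref = sorted(set(referenced))
    i = j = 0
    # Merge-style scan over both sorted, duplicate-free value lists.
    while i < len(dep) and j < len(ref):
        if dep[i] == ref[j]:
            i += 1
            j += 1
        elif dep[i] > ref[j]:
            j += 1
        else:
            # dep[i] is smaller than every remaining referenced value,
            # so it cannot be matched any more.
            return False
    return i == len(dep)


def find_unary_inds(columns):
    """columns: dict mapping a (hypothetical) attribute name to its list of values."""
    return sorted(
        (a, b)
        for a in columns
        for b in columns
        if a != b and is_included(columns[a], columns[b])
    )


# Hypothetical sample data, not taken from the book.
columns = {
    "orders.customer_id": [3, 1, 3, 2],
    "customers.id": [1, 2, 3, 4],
    "customers.age": [30, 41, 30, 25],
}
inds = find_unary_inds(columns)
print(inds)
```

An IND found this way is only a precondition for a foreign key, as the excerpt notes; whether the pair really is a key and foreign key still has to be decided separately.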
Book Synopsis Covering Or Complete? by : Jana Bauckmann
Download or read book Covering Or Complete? written by Jana Bauckmann and published by Universitätsverlag Potsdam. This book was released on 2012 with total page 40 pages. Available in PDF, EPUB and Kindle. Book excerpt: Data dependencies, or integrity constraints, are used to improve the quality of a database schema, to optimize queries, and to ensure consistency in a database. In recent years, conditional dependencies have been introduced to analyze and improve data quality. In short, a conditional dependency is a dependency with a limited scope defined by conditions over one or more attributes. Only the matching part of the instance must adhere to the dependency. In this paper we focus on conditional inclusion dependencies (CINDs). We generalize the definition of CINDs, distinguishing covering and completeness conditions. We present a new use case for such CINDs showing their value for solving complex data quality tasks. Further, we define quality measures for conditions inspired by precision and recall. We propose efficient algorithms that identify covering and completeness conditions conforming to given quality thresholds. Our algorithms choose not only the condition values but also the condition attributes automatically. Finally, we show that our approach efficiently provides meaningful and helpful results for our use case.
Book Synopsis CSOM/PL by : Michael Haupt
Download or read book CSOM/PL written by Michael Haupt and published by Universitätsverlag Potsdam. This book was released on 2011 with total page 38 pages. Available in PDF, EPUB and Kindle. Book excerpt:
Book Synopsis Business Process Model Abstraction by : Sergey Smirnov
Download or read book Business Process Model Abstraction written by Sergey Smirnov and published by Universitätsverlag Potsdam. This book was released on 2010 with total page 26 pages. Available in PDF, EPUB and Kindle. Book excerpt: Business process management aims at capturing, understanding, and improving work in organizations. The central artifacts are process models, which serve different purposes. Detailed process models are used to analyze concrete working procedures, while high-level models show, for instance, handovers between departments. To provide different views on process models, business process model abstraction has emerged. While several approaches have been proposed, a number of abstraction use cases that are both relevant for industry and scientifically challenging are yet to be addressed. In this paper we systematically develop, classify, and consolidate different use cases for business process model abstraction. The reported work is based on a study with BPM users in the health insurance sector and validated with a BPM consultancy company and a large BPM vendor. The fifteen identified abstraction use cases reflect the industry demand. The related work on business process model abstraction is evaluated against the use cases, which leads to a research agenda.
Book Synopsis Selected Papers of the International Workshop on Smalltalk Technologies by : Michael Haupt
Download or read book Selected Papers of the International Workshop on Smalltalk Technologies written by Michael Haupt and published by Universitätsverlag Potsdam. This book was released on 2010 with total page 48 pages. Available in PDF, EPUB and Kindle. Book excerpt: The goal of the IWST workshop series is to create and foster a forum around advancements of or experience in Smalltalk. The workshop welcomes contributions to all aspects, theoretical as well as practical, of Smalltalk-related topics.
Book Synopsis Data in Business Processes by : Andreas Meyer
Download or read book Data in Business Processes written by Andreas Meyer and published by Universitätsverlag Potsdam. This book was released on 2011 with total page 50 pages. Available in PDF, EPUB and Kindle. Book excerpt: Processes and data are equally important for business process management. Process data is especially relevant in the context of business process automation, process controlling, and the representation of an organization's assets. Many process modeling languages exist, each of which allows data to be represented through a fixed set of modeling constructs. However, these representations, and thus the degree of data modeling, differ considerably from one language to another. This report evaluates several process modeling languages with respect to their support for data modeling. As a common basis, we develop a framework that systematically organizes process- and data-related aspects. The criteria focus primarily on the data-related aspects. After introducing the framework, we compare twelve process modeling languages against it. We generalize the findings from these comparisons and identify clusters with respect to the degree of data modeling, into which the individual languages are classified.
Book Synopsis Proceedings of the ... Ph. D. Retreat of the HPI Research School on Service-Oriented Systems Engineering by : Christoph Meinel
Download or read book Proceedings of the ... Ph. D. Retreat of the HPI Research School on Service-Oriented Systems Engineering written by Christoph Meinel and published by Universitätsverlag Potsdam. This book was released on 2011 with total page 240 pages. Available in PDF, EPUB and Kindle. Book excerpt:
Book Synopsis State Propagation in Abstracted Business Processes by : Sergey Smirnov
Download or read book State Propagation in Abstracted Business Processes written by Sergey Smirnov and published by Universitätsverlag Potsdam. This book was released on 2011 with total page 26 pages. Available in PDF, EPUB and Kindle. Book excerpt: Business process models are abstractions of concrete operational procedures that occur in the daily business of organizations. To cope with the complexity of these models, business process model abstraction has been introduced recently. Its goal is to derive from a detailed process model several abstract models that provide a high-level understanding of the process. While techniques for constructing abstract models are reported in the literature, little is known about the relationships between process instances and abstract models. In this paper we show how the state of an abstract activity can be calculated from the states of related, detailed process activities as they happen. The approach uses activity state propagation. With state uniqueness and state transition correctness we introduce formal properties that improve the understanding of state propagation. Algorithms to check these properties are devised. Finally, we use behavioral profiles to identify and classify behavioral inconsistencies in abstract process models that might occur, once activity state propagation is used.
Book Synopsis Pattern Matching for an Object-oriented and Dynamically Typed Programming Language by : Felix Geller
Download or read book Pattern Matching for an Object-oriented and Dynamically Typed Programming Language written by Felix Geller and published by Universitätsverlag Potsdam. This book was released on 2010 with total page 100 pages. Available in PDF, EPUB and Kindle. Book excerpt: Pattern matching is a well-established concept in the functional programming community. It provides the means for concisely identifying and destructuring values of interest. This enables a clean separation of data structures and respective functionality, as well as dispatching functionality based on more than a single value. Unfortunately, expressive pattern matching facilities are seldom incorporated in present object-oriented programming languages. We present a seamless integration of pattern matching facilities in an object-oriented and dynamically typed programming language: Newspeak. We describe language extensions to improve the practicability and integrate our additions with the existing programming environment for Newspeak. This report is based on the first author’s master’s thesis.
Book Synopsis Extracting Structured Information from Wikipedia Articles to Populate Infoboxes by : Dustin Lange
Download or read book Extracting Structured Information from Wikipedia Articles to Populate Infoboxes written by Dustin Lange and published by Universitätsverlag Potsdam. This book was released on 2010 with total page 32 pages. Available in PDF, EPUB and Kindle. Book excerpt: Roughly every third Wikipedia article contains an infobox - a table that displays important facts about the subject in attribute-value form. The schema of an infobox, i.e., the attributes that can be expressed for a concept, is defined by an infobox template. Often, authors do not specify all template attributes, resulting in incomplete infoboxes. With iPopulator, we introduce a system that automatically populates infoboxes of Wikipedia articles by extracting attribute values from the article's text. In contrast to prior work, iPopulator detects and exploits the structure of attribute values for independently extracting value parts. We have tested iPopulator on the entire set of infobox templates and provide a detailed analysis of its effectiveness. For instance, we achieve an average extraction precision of 91% for 1,727 distinct infobox template attributes.
Book Synopsis The effect of tangible media on individuals in business process modeling by : Alexander Lübbe
Download or read book The effect of tangible media on individuals in business process modeling written by Alexander Lübbe and published by Universitätsverlag Potsdam. This book was released on 2011 with total page 52 pages. Available in PDF, EPUB and Kindle. Book excerpt: In current practice, business process modeling is done by trained method experts. Domain experts are interviewed to elicit their process information but are not involved in modeling. We created a haptic toolkit for process modeling that can be used in process elicitation sessions with domain experts. We hypothesize that this leads to more effective process elicitation. This paper breaks down "effective elicitation" into 14 operationalized hypotheses. They are assessed in a controlled experiment using questionnaires, process model feedback tests and video analysis. The experiment compares our approach to structured interviews in a repeated measurement design. We executed the experiment with 17 student clerks from a trade school. They represent potential users of the tool. Six out of fourteen hypotheses showed a significant difference due to the method applied. Subjects reported more fun and more insights into process modeling with tangible media. Video analysis showed significantly more reviews and corrections applied during process elicitation. Moreover, people take more time to talk and think about their processes. We conclude that tangible media creates a different working mode for people in process elicitation with fun, new insights and instant feedback on preliminary results.
Book Synopsis Proceedings of the Fall 2010 Future SOC Lab Day by : Christoph Meinel
Download or read book Proceedings of the Fall 2010 Future SOC Lab Day written by Christoph Meinel and published by Universitätsverlag Potsdam. This book was released on 2011 with total page 86 pages. Available in PDF, EPUB and Kindle. Book excerpt: In cooperation with industry partners, the Hasso-Plattner-Institut (HPI) is establishing an "HPI Future SOC Lab" that provides a complete infrastructure of highly complex on-demand systems on the latest massively parallel (multi-/many-core) hardware not yet available on the market, with enormous main memory capacities, together with software designed for it. The HPI Future SOC Lab has prototypical 4- and 8-way Intel 64-bit server systems from Fujitsu and Hewlett-Packard with 32 and 64 cores, respectively, and 1 - 2 TB of main memory. High-performance storage systems from EMC2 and virtualization solutions from VMware are also in use. SAP provides its latest Business by Design (ByD) software, and complex real-world enterprise data is also available for research purposes. Interested researchers from university and non-university research institutions can use the HPI Future SOC Lab to study future highly complex IT systems, develop new ideas, data structures, and algorithms, and pursue them through to practical testing. This technical report presents first results of the research projects launched at the opening of the Future SOC Lab in June 2010. Selected projects presented their results on October 27, 2010 at the Future SOC Lab Day event.
Book Synopsis Toward Bridging the Gap Between Formal Semantics and Implementation of Triple Graph Grammars by : Holger Giese
Download or read book Toward Bridging the Gap Between Formal Semantics and Implementation of Triple Graph Grammars written by Holger Giese and published by Universitätsverlag Potsdam. This book was released on 2010 with total page 34 pages. Available in PDF, EPUB and Kindle. Book excerpt: The correctness of model transformations is a crucial element for the model-driven engineering of high quality software. A prerequisite to verify model transformations at the level of the model transformation specification is that an unambiguous formal semantics exists and that the employed implementation of the model transformation language adheres to this semantics. However, for existing relational model transformation approaches it is usually not clear under which constraints particular implementations really conform to the formal semantics. In this paper, we will bridge this gap for the formal semantics of triple graph grammars (TGG) and an existing efficient implementation. Whereas the formal semantics assumes backtracking and ignores non-determinism, practical implementations do not support backtracking, require rule sets that ensure determinism, and include further optimizations. Therefore, we capture how the considered TGG implementation realizes the transformation by means of operational rules, define required criteria and show conformance to the formal semantics if these criteria are fulfilled. We further outline how static analysis can be employed to guarantee these criteria.
Book Synopsis Adaptive Windows for Duplicate Detection by : Uwe Draisbach
Download or read book Adaptive Windows for Duplicate Detection written by Uwe Draisbach and published by Universitätsverlag Potsdam. This book was released on 2012 with total page 46 pages. Available in PDF, EPUB and Kindle. Book excerpt: Duplicate detection is the task of identifying all groups of records within a data set that represent the same real-world entity. This task is difficult, because (i) representations might differ slightly, so some similarity measure must be defined to compare pairs of records and (ii) data sets might have a high volume making a pair-wise comparison of all records infeasible. To tackle the second problem, many algorithms have been suggested that partition the data set and compare all record pairs only within each partition. One well-known such approach is the Sorted Neighborhood Method (SNM), which sorts the data according to some key and then advances a window over the data comparing only records that appear within the same window. We propose several variations of SNM that have in common a varying window size and advancement. The general intuition of such adaptive windows is that there might be regions of high similarity suggesting a larger window size and regions of lower similarity suggesting a smaller window size. We propose and thoroughly evaluate several adaptation strategies, some of which are provably better than the original SNM in terms of efficiency (same results with fewer comparisons).
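The basic Sorted Neighborhood Method described in the excerpt can be sketched as follows. This shows only the fixed-window baseline, not the adaptive variants the book proposes. The sorting key, the similarity rule, and the sample names are all hypothetical and chosen only to keep the example small.

```python
def sorted_neighborhood(records, key, is_duplicate, window=3):
    """Fixed-window SNM: sort by `key`, then compare each record only
    with the next `window - 1` records in sort order."""
    ordered = sorted(records, key=key)
    pairs = []
    for i, rec in enumerate(ordered):
        for other in ordered[i + 1 : i + window]:
            if is_duplicate(rec, other):
                pairs.append((rec, other))
    return pairs


def name_key(name):
    # Hypothetical sorting key: last name followed by first name.
    first, last = name.split()
    return last + first


def same_person(a, b):
    # Hypothetical similarity rule: same last name and same first initial.
    return a.split()[-1] == b.split()[-1] and a[0] == b[0]


records = ["Jon Smith", "Mary Jones", "John Smith", "Zed Young", "Marie Jones"]
pairs = sorted_neighborhood(records, name_key, same_person, window=2)
print(pairs)
```

With a window of size w over n records, the scan performs at most n * (w - 1) comparisons instead of the n * (n - 1) / 2 needed for all pairs, which is the saving that makes the window size worth tuning.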
Book Synopsis Advancing the Discovery of Unique Column Combinations by : Ziawasch Abedjan
Download or read book Advancing the Discovery of Unique Column Combinations written by Ziawasch Abedjan and published by Universitätsverlag Potsdam. This book was released on 2011 with total page 30 pages. Available in PDF, EPUB and Kindle. Book excerpt: Unique column combinations of a relational database table are sets of columns that contain only unique values. Discovering such combinations is a fundamental research problem and has many different data management and knowledge discovery applications. Existing discovery algorithms are either brute force or have a high memory load and can thus be applied only to small datasets or samples. In this paper, the well-known GORDIAN algorithm and "Apriori-based" algorithms are compared and analyzed for further optimization. We greatly improve the Apriori algorithms through efficient candidate generation and statistics-based pruning methods. A hybrid solution HCAGORDIAN combines the advantages of GORDIAN and our new algorithm HCA, and it significantly outperforms all previous work in many situations.
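To make the notion of a unique column combination concrete, here is a naive level-wise search for minimal uniques. It is a brute-force illustration only, and it is neither GORDIAN nor HCA nor any of the optimized algorithms the book discusses. The function names, the dictionary-based row layout, and the sample table are assumptions for the example.

```python
from itertools import combinations


def is_unique(rows, cols):
    """True if no two rows agree on all columns in `cols`."""
    seen = set()
    for row in rows:
        projection = tuple(row[c] for c in cols)
        if projection in seen:
            return False
        seen.add(projection)
    return True


def minimal_uniques(rows, columns):
    """Check candidates by increasing size and skip any candidate that
    contains an already found unique, since it cannot be minimal."""
    found = []
    for size in range(1, len(columns) + 1):
        for combo in combinations(columns, size):
            if any(set(u) <= set(combo) for u in found):
                continue
            if is_unique(rows, combo):
                found.append(combo)
    return found


# Hypothetical sample table, not taken from the book.
rows = [
    {"id": 1, "first": "Ann", "last": "Lee"},
    {"id": 2, "first": "Ann", "last": "Kim"},
    {"id": 3, "first": "Bob", "last": "Lee"},
]
uniques = minimal_uniques(rows, ["id", "first", "last"])
print(uniques)
```

The number of candidates grows exponentially with the number of columns, which is why the excerpt stresses candidate generation and pruning.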
Book Synopsis Survey on Healthcare IT Systems by : Christian Neuhaus
Download or read book Survey on Healthcare IT Systems written by Christian Neuhaus and published by Universitätsverlag Potsdam. This book was released on 2011 with total page 62 pages. Available in PDF, EPUB and Kindle. Book excerpt: IT systems for healthcare are a complex and exciting field. On the one hand, there is a vast number of improvements and work alleviations that computers can bring to everyday healthcare. Some ways of treatment, diagnoses and organisational tasks were even made possible by computer usage in the first place. On the other hand, there are many factors that encumber computer usage and make development of IT systems for healthcare a challenging, sometimes even frustrating task. These factors are not solely technology-related, but just as well social or economic conditions. This report describes some of the idiosyncrasies of IT systems in the healthcare domain, with a special focus on legal regulations, standards and security.
Book Synopsis Business Process Management by : Shazia Sadiq
Download or read book Business Process Management written by Shazia Sadiq and published by Springer. This book was released on 2014-08-12 with total page 449 pages. Available in PDF, EPUB and Kindle. Book excerpt: This book constitutes the proceedings of the 12th International Conference on Business Process Management, BPM 2014, held in Haifa, Israel, in September 2014. The 21 regular papers and 10 short papers included in this volume were carefully reviewed and selected from 123 submissions. The papers are organized in 9 topical sections on declarative processes, user-centered process approaches, process discovery, integrative BPM, resource and time management in BPM, process analytics, process enabled environments, discovery and monitoring, and industry papers.