Conferences related to MapReduce


2019 19th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGRID)

Cluster Computing, Grid Computing, Edge Computing, Cloud Computing, Parallel Computing, Distributed Computing


2019 28th International Conference on Parallel Architectures and Compilation Techniques (PACT)

PACT brings together researchers from architecture, compilers, applications, and languages to present and discuss innovative research of common interest.


2019 IFIP/IEEE Symposium on Integrated Network and Service Management (IM)

Management of information and communication technology focusing on research, development, integration, standards, service provisioning, and user communities.


2019 International Conference on Information Networking (ICOIN)

The International Conference on Information Networking (ICOIN) will take place in Kuala Lumpur, Malaysia. Over the past thirty years, computer communication and networking technologies have changed every aspect of our lives and societies. Having contributed greatly to current IT advancement, computer networks will also play a key role in new IT paradigms such as IoT and cloud computing and will be applied to many areas of society, including industry, business, politics, culture, and medicine. The main purpose of ICOIN 2019 is to advance research in computer communication and networking technologies and to encourage open discussion of them.


2018 14th International Conference on Natural Computation, Fuzzy Systems and Knowledge Discovery (ICNC-FSKD)

International Conference on Natural Computation, Fuzzy Systems and Knowledge Discovery (ICNC-FSKD) is a premier international forum for scientists and researchers to present the state of the art of data mining and intelligent methods inspired by nature, particularly biological, linguistic, and physical systems, with applications to computers, circuits, systems, control, robotics, communications, and more.

  • 2017 13th International Conference on Natural Computation, Fuzzy Systems and Knowledge Discovery (ICNC-FSKD)

    International Conference on Natural Computation, Fuzzy Systems and Knowledge Discovery (ICNC-FSKD) is a premier international forum for scientists and researchers to present the state of the art of data mining and intelligent methods inspired by nature, particularly biological, linguistic, and physical systems, with applications to computers, circuits, systems, control, robotics, communications, and more.

  • 2016 12th International Conference on Natural Computation and 13th Fuzzy Systems and Knowledge Discovery (ICNC-FSKD)

    International Conference on Natural Computation, Fuzzy Systems and Knowledge Discovery (ICNC-FSKD) is a premier international forum for scientists and researchers to present the state of the art of data mining and intelligent methods inspired by nature, particularly biological, linguistic, and physical systems, with applications to computers, circuits, systems, control, robotics, communications, and more.

  • 2015 12th International Conference on Fuzzy Systems and Knowledge Discovery (FSKD)

    FSKD is an international forum on fuzzy systems and knowledge discovery. Specific topics include fuzzy sets, bioinformatics and biomedical informatics, genomics, proteomics, big data, databases and applications, semi-structured/unstructured data mining, multimedia mining, web and text data mining, graphic model discovery, data warehousing and OLAP, pattern recognition and diagnostics, etc.

  • 2014 11th International Conference on Fuzzy Systems and Knowledge Discovery (FSKD)

    FSKD is an international forum on fuzzy systems and knowledge discovery. Specific topics include fuzzy sets, rough sets, statistical methods, parallel/distributed data mining, the KDD process and human interaction, knowledge management, knowledge visualization, reliability and robustness, knowledge discovery in specific domains, high-dimensional data, temporal data, data streaming, scientific databases, semi-structured/unstructured data, multimedia, text, web and the Internet, graphic model discovery, software warehouses and software mining, data engineering, communications and networking, software engineering, distributed systems, and computer hardware.

  • 2013 10th International Conference on Fuzzy Systems and Knowledge Discovery (FSKD)

    FSKD is an international forum on fuzzy systems and knowledge discovery. Specific topics include fuzzy sets, rough sets, statistical methods, parallel/distributed data mining, the KDD process and human interaction, knowledge management, knowledge visualization, reliability and robustness, knowledge discovery in specific domains, high-dimensional data, temporal data, data streaming, scientific databases, semi-structured/unstructured data, multimedia, text, web and the Internet, graphic model discovery, software warehouses and software mining, data engineering, communications and networking, software engineering, distributed systems, computer hardware, etc.

  • 2012 9th International Conference on Fuzzy Systems and Knowledge Discovery (FSKD)

    FSKD is an international forum on fuzzy systems and knowledge discovery. Specific topics include fuzzy theory and foundations; stability of fuzzy systems; fuzzy methods and algorithms; fuzzy image, speech and signal processing; multimedia; fuzzy hardware and architectures; data mining.

  • 2011 Eighth International Conference on Fuzzy Systems and Knowledge Discovery (FSKD)

    FSKD is an international forum on fuzzy systems and knowledge discovery. Specific topics include fuzzy theory and foundations; stability of fuzzy systems; fuzzy methods and algorithms; fuzzy image, speech and signal processing; multimedia; fuzzy hardware and architectures; data mining.

  • 2010 Seventh International Conference on Fuzzy Systems and Knowledge Discovery (FSKD)

    FSKD is an international forum on fuzzy systems and knowledge discovery. Specific topics include fuzzy theory and foundations; stability of fuzzy systems; fuzzy methods and algorithms; fuzzy image, speech and signal processing; multimedia; fuzzy hardware and architectures; data mining.

  • 2007 International Conference on Fuzzy Systems and Knowledge Discovery (FSKD)

    FSKD '07 covers all aspects of fuzzy systems and knowledge discovery, including recent theoretical advances and interesting applications, for example, fuzzy theory and models, mathematical foundation of fuzzy systems, fuzzy image/signal processing, fuzzy control and robotics, fuzzy hardware and architectures, fuzzy systems and the internet, fuzzy optimization and modeling, fuzzy decision and support, classification, clustering, statistical methods, knowledge etc.



Periodicals related to MapReduce


Computer

Computer, the flagship publication of the IEEE Computer Society, publishes peer-reviewed technical content that covers all aspects of computer science, computer engineering, technology, and applications. Computer is a resource that practitioners, researchers, and managers can rely on to provide timely information about current research developments, trends, best practices, and changes in the profession.


Computers, IEEE Transactions on

Design and analysis of algorithms, computer systems, and digital networks; methods for specifying, measuring, and modeling the performance of computers and computer systems; design of computer components, such as arithmetic units, data storage devices, and interface devices; design of reliable and testable digital devices and systems; computer networks and distributed computer systems; new computer organizations and architectures; applications of VLSI ...


Industrial Informatics, IEEE Transactions on

IEEE Transactions on Industrial Informatics focuses on knowledge-based factory automation as a means to enhance industrial fabrication and manufacturing processes. This embraces a collection of techniques that use information analysis, manipulation, and distribution to achieve higher efficiency, effectiveness, reliability, and/or security within the industrial environment. The scope of the Transaction includes reporting, defining, providing a forum for discourse, and informing ...


Information Theory, IEEE Transactions on

The fundamental nature of the communication process; storage, transmission and utilization of information; coding and decoding of digital and analog communication transmissions; study of random interference and information-bearing signals; and the development of information-theoretic techniques in diverse areas, including data communication and recording systems, communication networks, cryptography, detection systems, pattern recognition, learning, and automata.


Intelligent Transportation Systems, IEEE Transactions on

The theoretical, experimental and operational aspects of electrical and electronics engineering and information technologies as applied to Intelligent Transportation Systems (ITS). Intelligent Transportation Systems are defined as those systems utilizing synergistic technologies and systems engineering concepts to develop and improve transportation systems of all kinds. The scope of this interdisciplinary activity includes the promotion, consolidation and coordination of ITS technical ...




Xplore Articles related to MapReduce


h-MapReduce: A Framework for Workload Balancing in MapReduce

2013 IEEE 27th International Conference on Advanced Information Networking and Applications (AINA), 2013

The big data analytics community has accepted MapReduce as a programming model for processing massive data on distributed systems such as a Hadoop cluster. MapReduce has been evolving to improve its performance. We identified skewed workload among workers in the MapReduce ecosystem. The problem of skewed workload is of serious concern for massive data processing. We tackled the workload balancing ...
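For readers unfamiliar with the programming model these papers build on, the canonical word-count example can be sketched in plain Python. This is a single-process illustration of the map, shuffle, and reduce stages only, not the h-MapReduce framework above:

```python
from collections import defaultdict

def map_phase(documents):
    # Map: emit a (word, 1) pair for every word in every input split.
    for doc in documents:
        for word in doc.split():
            yield word.lower(), 1

def shuffle(pairs):
    # Shuffle: group the intermediate values by key.
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    # Reduce: aggregate each key's list of values.
    return {word: sum(counts) for word, counts in groups.items()}

docs = ["the quick fox", "the lazy dog", "the fox"]
counts = reduce_phase(shuffle(map_phase(docs)))
print(counts["the"])  # 3
```

In a real framework the map and reduce stages run on different machines and the shuffle moves data over the network; the data flow, however, is exactly this.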


Notice of Violation of IEEE Publication Principles: Pipelined-MapReduce: An Improved MapReduce Parallel Programming Model

2011 Fourth International Conference on Intelligent Computation Technology and Automation, 2011

MapReduce is a parallel programming model used to handle large datasets. MapReduce programs can be executed automatically and concurrently on large clusters of commodity machines. We proposed an improved MapReduce programming model, Pipelined-MapReduce, to address data-intensive information retrieval problems. Pipelined-MapReduce allows data to be transferred by pipeline between operations, extending the batched MapReduce programming model, and can reduce the ...
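The pipelining idea can be illustrated with Python generators: the reducer consumes intermediate records as the mapper produces them, instead of waiting for a fully materialized batch. This toy single-machine sketch is ours, not the authors' implementation:

```python
from collections import defaultdict

def mapper(lines):
    # Lazily emit (word, 1) pairs; nothing is buffered ahead of the reducer.
    for line in lines:
        for word in line.split():
            yield word, 1

def pipelined_reduce(pairs):
    # Consume intermediate pairs as they stream in from the mapper,
    # rather than after the whole map phase has finished.
    counts = defaultdict(int)
    for word, one in pairs:
        counts[word] += one
    return dict(counts)

result = pipelined_reduce(mapper(["a b a", "b c"]))
print(result)
```

The generator never holds the full intermediate dataset in memory, which is the same property that lets a pipelined runtime overlap map-output transfer with reduction.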


In-cache MapReduce: Leverage Tiling to Boost Temporal Locality-Sensitive MapReduce Computations

2016 IEEE International Conference on Cluster Computing (CLUSTER), 2016

The MapReduce framework is being increasingly used in the scientific computing and image/video processing fields. Relevant research has tailored it to these fields' specificities, but there are still serious limitations when it comes to temporal locality-sensitive computations. The performance of this class of computations is closely tied to efficient use of the memory hierarchy, a concern that is not yet ...
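Tiling, the technique the paper applies, is general: restructure a computation over a large array into cache-sized blocks so reused data stays resident. A minimal one-dimensional sketch (ours, not the paper's Hadoop prototype; the tile size is arbitrary):

```python
def stencil(a):
    # A 3-point moving average over the interior of the array.
    return [(a[i - 1] + a[i] + a[i + 1]) / 3 for i in range(1, len(a) - 1)]

def tiled_stencil(a, tile=4):
    # Process the interior in fixed-size tiles; in a real implementation each
    # tile plus its one-element halo would be sized to fit in cache.
    out = []
    for start in range(1, len(a) - 1, tile):
        stop = min(start + tile, len(a) - 1)
        out.extend((a[i - 1] + a[i] + a[i + 1]) / 3 for i in range(start, stop))
    return out

data = list(range(16))
assert tiled_stencil(data) == stencil(data)  # same result, tile by tile
```

The tiled version produces identical output; the payoff appears only when the array no longer fits in cache, which is precisely what the paper exploits inside MapReduce's split stage.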


A Semantic++ MapReduce: A Preliminary Report

2014 IEEE International Conference on Semantic Computing, 2014

Big data processing is one of the pressing scientific issues in current social development. MapReduce is an important foundation for big data processing. In this paper, we propose a semantic++ MapReduce. This study includes four parts. (1) Semantic++ extraction and management for big data: we will research methods for automatically extracting, labeling, and managing big data's ...


A New Approach to the Cloud-Based Heterogeneous MapReduce Placement Problem

IEEE Transactions on Services Computing, 2016

Guaranteeing quality of service (QoS) with minimum computation cost is the most important objective of cloud-based MapReduce computations. Minimizing the total computation cost of cloud-based MapReduce computations is done through MapReduce placement optimization. MapReduce placement optimization approaches can be classified into two categories: homogeneous MapReduce placement optimization and heterogeneous MapReduce placement optimization. It is generally believed that heterogeneous MapReduce placement ...
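The flavor of the placement problem can be pictured with a toy cost model. Everything here is illustrative: the VM types, prices (in cents per hour), slot counts, and the single-type restriction are our assumptions, not the paper's formulation:

```python
# Hypothetical VM types: (name, price in cents per hour, MapReduce slots).
VM_TYPES = [("small", 5, 2), ("medium", 10, 5), ("large", 20, 12)]

def cheapest_placement(required_slots):
    """Constructive heuristic: among placements that use a single VM type and
    satisfy the slot requirement, pick the one with the lowest total price."""
    best = None
    for name, price, slots in VM_TYPES:
        n = -(-required_slots // slots)  # ceiling division: VMs needed
        cost = n * price
        if best is None or cost < best[1]:
            best = (name, cost, n)
    return best

print(cheapest_placement(10))  # a 2-VM "medium" placement at 20 cents/hour
```

A heterogeneous optimizer would additionally mix VM types and account for spare slots on already-running instances, which is where the paper's constrained combinatorial formulation earns its savings.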



Educational Resources on MapReduce


IEEE.tv Videos

No IEEE.tv Videos are currently tagged "MapReduce"

IEEE-USA E-Books

  • h-MapReduce: A Framework for Workload Balancing in MapReduce

    The big data analytics community has accepted MapReduce as a programming model for processing massive data on distributed systems such as a Hadoop cluster. MapReduce has been evolving to improve its performance. We identified skewed workloads among workers in the MapReduce ecosystem; skewed workload is a serious concern for massive data processing. We tackled the workload balancing issue by introducing a hierarchical MapReduce, or h-MapReduce for short. h-MapReduce identifies a heavy task by a properly defined cost function. The heavy task is divided into child tasks that are distributed among available workers as a new job in the MapReduce framework. The invocation of new jobs from a task poses several challenges that are addressed by h-MapReduce. Our experiments on h-MapReduce demonstrated its performance gain over standard MapReduce for data-intensive algorithms; more specifically, the performance gain grows exponentially with the size of the networks. In addition to the exponential performance gains, our investigations also found a negative effect of deploying h-MapReduce with an inappropriate definition of heavy tasks, which provides a guideline for effective application of h-MapReduce.

  • Notice of Violation of IEEE Publication Principles: Pipelined-MapReduce: An Improved MapReduce Parallel Programming Model

    MapReduce is a parallel programming model used to handle large datasets. MapReduce programs can be executed automatically and concurrently on large clusters of commodity machines. We proposed an improved MapReduce programming model, Pipelined-MapReduce, to address data-intensive information retrieval problems. Pipelined-MapReduce allows data to be transferred by pipeline between operations, extending the batched MapReduce programming model; it can reduce completion time and improve system utilization. The experimental results demonstrate that the implementation of Pipelined-MapReduce scales well and efficiently processes large datasets on commodity machines.

  • In-cache MapReduce: Leverage Tiling to Boost Temporal Locality-Sensitive MapReduce Computations

    The MapReduce framework is being increasingly used in the scientific computing and image/video processing fields. Relevant research has tailored it to these fields' specificities, but there are still serious limitations when it comes to temporal locality-sensitive computations. The performance of this class of computations is closely tied to efficient use of the memory hierarchy, a concern that is not yet taken into consideration by existing distributed MapReduce runtimes. Consequently, implementing temporal locality-sensitive computations, such as stencil algorithms, on top of MapReduce is a complex chore not rewarded with proportional dividends. This paper tackles both the complexity and the performance issues by integrating tiling techniques and memory hierarchy information into MapReduce's split stage. We prototyped our proposal atop the Apache Hadoop framework and applied it to the context of stencil computations. Our experimental results reveal that, for a typical stencil computation, our prototype clearly outperforms Hadoop MapReduce, especially as the computation scales.

  • A Semantic++ MapReduce: A Preliminary Report

    Big data processing is one of the pressing scientific issues in current social development. MapReduce is an important foundation for big data processing. In this paper, we propose a semantic++ MapReduce. This study includes four parts. (1) Semantic++ extraction and management for big data: we will research methods for automatically extracting, labeling, and managing big data's semantic++ information. (2) SMRPL (Semantic++ MapReduce Programming Language): a declarative programming language that is close to human thinking and is used to program big data applications. (3) Semantic++ MapReduce compilation methods. (4) Semantic++ MapReduce computing technology, which itself includes three parts: 1) analysis of semantic++ index information of the data block, the description of the semantic++ index structure, and a method for automatically loading semantic++ index information; 2) analysis of semantic++ operations such as semantic++ sorting, semantic++ grouping, semantic++ merging, and semantic++ query in the map and reduce phases; 3) a shuffle scheduling strategy based on semantic++ techniques. This paper's research will optimize MapReduce and enhance its processing efficiency and ability. Our research will provide theoretical and technological accumulation for intelligent processing of big data.

  • A New Approach to the Cloud-Based Heterogeneous MapReduce Placement Problem

    Guaranteeing quality of service (QoS) with minimum computation cost is the most important objective of cloud-based MapReduce computations. Minimizing the total computation cost of cloud-based MapReduce computations is done through MapReduce placement optimization. MapReduce placement optimization approaches can be classified into two categories: homogeneous MapReduce placement optimization and heterogeneous MapReduce placement optimization. It is generally believed that heterogeneous MapReduce placement optimization is more effective than homogeneous MapReduce placement optimization in reducing the total running cost of cloud-based MapReduce computations. This paper proposes a new approach to the heterogeneous MapReduce placement optimization problem. In this new approach, the heterogeneous MapReduce placement optimization problem is transformed into a constrained combinatorial optimization problem and is solved by an innovative constructive algorithm. Experimental results show that the running cost of the cloud-based MapReduce computation platform using this new approach is 24.3-44.0 percent lower than that using the most popular homogeneous MapReduce placement approach, and 2.0-36.2 percent lower than that using the heterogeneous MapReduce placement approach not considering the spare resources from the existing MapReduce computations. The experimental results have also demonstrated the good scalability of this new approach.

  • Scheduling MapReduce tasks on virtual MapReduce clusters from a tenant's perspective

    Renting a set of virtual private servers (VPSs for short) from a VPS provider to establish a virtual MapReduce cluster is cost-efficient for a company or organization. To shorten job turnaround time and keep data locality as high as possible in this type of environment, this paper proposes a Best-Fit Task Scheduling scheme (BFTS for short) from a tenant's perspective. BFTS schedules each map task to the VPS that can finish the task earlier than the other VPSs by predicting and comparing, in an online manner, the time required by every VPS to retrieve the map-input data, execute the map task, and become idle. Furthermore, BFTS schedules each reduce task to a VPS that is close to most of the VPSs that execute the related map tasks. We conduct extensive experiments to compare BFTS with several scheduling algorithms employed by Hadoop. The experimental results show that BFTS is better than the other tested algorithms in terms of map-data locality, reduce-data locality, and job turnaround time. The overhead incurred by BFTS is also evaluated; it is inevitable but acceptable compared with the other algorithms.

  • D3-MapReduce: Towards MapReduce for Distributed and Dynamic Data Sets

    Since its introduction in 2004 by Google, MapReduce has become the programming model of choice for processing large data sets. Although MapReduce was originally developed for use by web enterprises in large data centers, the technique has gained a lot of attention from the scientific community for its applicability to large parallel data analyses (including geographic, high-energy physics, genomics, etc.). So far MapReduce has mostly been designed for batch processing of bulk data. The ambition of D3-MapReduce is to extend the MapReduce programming model and propose efficient implementations of this model to: i) cope with distributed data sets, i.e., those that span multiple distributed infrastructures or are stored on networks of loosely connected devices; and ii) cope with dynamic data sets, i.e., those that change over time or can be incomplete or only partially available. In this paper, we draw the path towards this ambitious goal. Our approach leverages the Data Life Cycle as a key concept to provide MapReduce for distributed and dynamic data sets on heterogeneous and distributed infrastructures. We first report on our attempts at implementing the MapReduce programming model for Hybrid Distributed Computing Infrastructures (Hybrid DCIs). We present the architecture of the prototype based on BitDew, a middleware for large-scale data management, and Active Data, a programming model for data life cycle management. Second, we outline the challenges in terms of methodology and present our approaches based on simulation and emulation on the Grid'5000 experimental testbed. We conduct performance evaluations and compare our prototype with Hadoop, the industry-reference MapReduce implementation. We present our work in progress on dynamic data sets, which has led us to implement an incremental MapReduce framework. Finally, we discuss our achievements and outline the challenges that remain to be addressed before obtaining a complete D3-MapReduce environment.

  • An efficient Frequent Patterns Mining Algorithm based on MapReduce Framework

    Recently, data collected from businesses has been growing continuously in every enterprise. Big data, cloud computing, and data mining have become hot topics. How to acquire important information quickly from these data is a critical issue. In this paper, we modify the traditional Apriori algorithm to improve its execution efficiency, since the Apriori algorithm suffers from computation time that increases dramatically as data size increases. Because a one-phase algorithm uses only one MapReduce operation, it generates excessive candidates and can exhaust memory. We design and implement an efficient algorithm: the Frequent Patterns Mining Algorithm Based on the MapReduce Framework (FAMR). We adopt Hadoop MapReduce as the experiment platform. The experimental results show that FAMR achieves a 16.2× speedup in running time compared with the one-phase algorithm.

  • Improving Hadoop MapReduce performance with data compression: A study using wordcount job

    Hadoop clusters are widely used for executing and analyzing large datasets such as big data; Hadoop's MapReduce engine distributes data to each node in the cluster. Compression benefits a Hadoop cluster because it not only increases effective storage capacity but can also improve job performance. Popular Hadoop compression codecs include deflate, gzip, bzip2, and snappy; for compressed input files, Hadoop supports gzip and bzip2. The goal of this research is to improve the computing performance of a wordcount job using different Hadoop compression options. We tested two scenarios. In Scenario I, we compress the map output: with a raw-text input file, only snappy and deflate yielded better execution times, meaning that map-output compression does not always outperform the uncompressed case. In Scenario II, we use a bzip2-compressed input file with uncompressed MapReduce: execution times were similar for raw text and bzip2, meaning a bzip2 input file can reduce disk space while maintaining computing performance. In conclusion, Hadoop compression with a bzip2 input file can improve the wordcount MapReduce execution profile in a Hadoop cluster.

  • MapReduce-Based RESTMD: Enabling Large-Scale Sampling Tasks with Distributed HPC Systems

    A novel implementation of Replica Exchange Statistical Temperature Molecular Dynamics (RESTMD), a generalized-ensemble method also known as parallel tempering, is presented. Our implementation employs a MapReduce (MR)-based iterative framework for launching RESTMD over high-performance computing (HPC) clusters, including our testbed system, the Cyber-infrastructure for Reconfigurable Optical Networks (CRON), which simulates a network-connected distributed system. Our main contribution is a new implementation of STMD plugged into the well-known CHARMM molecular dynamics package, as well as a RESTMD implementation powered by Hadoop that scales out within a cluster and across distributed systems effectively. To address challenges in the use of Hadoop MapReduce, we examined factors contributing to the performance of the proposed framework through runtime analysis experiments with two biological systems that differ in size, over different types of HPC resources. The many advantages of RESTMD suggest its effectiveness for enhanced sampling, one of the grand challenges in a variety of areas of study ranging from chemical systems to statistical inference. Lastly, with its support for scale-across capacity over distributed computing infrastructure (DCI) and its use of Hadoop for coarse-grained task-level parallelism, MapReduce-based RESTMD is a good example of the next generation of applications increasingly demanded by science gateway projects, in particular those backed by IaaS clouds.
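The trade-off measured in the compression study above can be reproduced in miniature with Python's standard library: map output for a word-count job is highly repetitive, so compressing it sharply reduces what must be spilled to disk and shuffled, at some CPU cost. This toy is ours, not the paper's Hadoop setup:

```python
import gzip

# Simulated map output for a word-count job: repetitive (key, value) records,
# which is exactly why compressing intermediate data pays off in practice.
map_output = "\n".join(f"word{i % 100}\t1" for i in range(10_000)).encode()

compressed = gzip.compress(map_output)
ratio = len(compressed) / len(map_output)
print(f"raw: {len(map_output)} bytes, gzip: {len(compressed)} bytes "
      f"({ratio:.0%} of the original)")
```

Whether the saved I/O outweighs the compression CPU time depends on the codec and the cluster, which is what the two scenarios in the study set out to measure.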



Standards related to MapReduce


No standards are currently tagged "MapReduce"

